You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Shawn Heisey (Created) (JIRA)" <ji...@apache.org> on 2011/11/10 20:14:51 UTC

[jira] [Created] (SOLR-2889) Implement Adaptive Replacement Cache

Implement Adaptive Replacement Cache
------------------------------------

                 Key: SOLR-2889
                 URL: https://issues.apache.org/jira/browse/SOLR-2889
             Project: Solr
          Issue Type: New Feature
          Components: search
    Affects Versions: 3.4
            Reporter: Shawn Heisey
            Priority: Minor


Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:

http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache

If it's feasible, this could be a good addition to Solr/Lucene.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148529#comment-13148529 ] 

Shawn Heisey commented on SOLR-2889:
------------------------------------

Two things:

1) After some thought, I have concluded that a straight LFU cache might fit my needs perfectly, and it's a baby step towards ARC.

2) I took a quick look at some of the code.  The code for cache trimming and warming is in ConcurrentLRUCache.java, but the hits seem to be tracked in {Fast}LRUCache.java.  I think this means that the first step would be to refactor things so that we have one or more base classes with common functionality, which are then extended or imported by smaller classes that implement LRU, LFU, and ARC.

Am I on the right track?  Does this need another issue?

                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Simon Willnauer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148082#comment-13148082 ] 

Simon Willnauer commented on SOLR-2889:
---------------------------------------

@yonik: I agree progress over perfection... this can happen afterwards

@shawn: please go ahead and hack on what you wanted to do, your contribution is more than welcome! You should start where and with what you feel comfortable and we gonna work towards improving our codebase. My idea I mentioned above with adding a module is a long term goal, lets concentrate on what this issue tries to achieve.

@jason: I agree with yonik, we have been there and we should not express our strong feelings loud in every issue possible. We'll get there its open source! 
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152641#comment-13152641 ] 

Shawn Heisey commented on SOLR-2889:
------------------------------------

My original approach wasn't working well, which is why I said I wasn't going to be able to do it.  Today I took a different approach, and the changes were pretty easy.  I just made copies of ConcurrentLRUCache.java and FastLRUCache.java, then renamed and massaged them into LFU versions.  The heart of what I did was remove lastAccessed and turned it into an AtomicLong named hits.

It does work as a cache for some simple hand-entered queries, but I need to do some more extensive testing to see if evictions and warming are working as expected before I upload it.  I think I'll temporarily stick in some println statements to watch what it's doing.

Some other things that need to be done that I'm not sure I'm qualified for (but I will attempt):

- Test code.
- Abstracting out the common parts into parent classes.

                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Jason Rutherglen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148269#comment-13148269 ] 

Jason Rutherglen commented on SOLR-2889:
----------------------------------------

Simon and Yonik, re-read what you wrote, have fun.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148014#comment-13148014 ] 

Yonik Seeley commented on SOLR-2889:
------------------------------------

bq. Put it in Lucene and NOT Solr. thanks.

We've been over this - improvements to Solr should not be blocked just because someone else desires the functionality in lucene.
It's up to the person doing the work, what their objectives are, how much time they have, etc.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Chris Male (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149215#comment-13149215 ] 

Chris Male commented on SOLR-2889:
----------------------------------

ConcurrentLRUCache has been moved out of SolrJ and into Solr Core in trunk and 3x in SOLR-2758.  Which source code checkout are you looking at?
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Simon Willnauer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147937#comment-13147937 ] 

Simon Willnauer commented on SOLR-2889:
---------------------------------------

having a more diverse caching infrastructure is certainly something solr & lucene could make use of. I'd like to see this kind of stuff in lucene too so lucene users can make use of this too. Maybe we can factor out the solr caches into a caching module by makeing them more generic?
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149208#comment-13149208 ] 

Yonik Seeley commented on SOLR-2889:
------------------------------------

bq. LRUCache relies on the alternate sort order on LinkedHashMap, so it would not be as simple to add usage tracking.

A single cache that tracks hits per entry would be enough I think - no need to add the capability to all existing cache types.

bq. The solrj tree seems like an odd place for ConcurrentLRUCache, because nothing else in that section uses it (directly at least).

Hmmm, never realized that.  Perhaps just due to the fact that it was in the util package?
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148124#comment-13148124 ] 

Shawn Heisey commented on SOLR-2889:
------------------------------------

@simon: I will certainly take a look, and I am encouraged by yonik's assessment that it's a simple change, but I have to say that I'm a Java newbie.  I hope that I can do it, but I'm not super optimistic.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254671#comment-13254671 ] 

Shawn Heisey commented on SOLR-2889:
------------------------------------

After thinking about what ended up being the final code for SOLR-2906, I know that I won't be able to tackle this, but I am wondering whether this is really necessary any more.  The timeDecay option on the LFU cache implementation could be viewed as an LRU tweak to the LFU cache, which I think fulfills my original goals even if it's not a true ARC cache.  Does that mean this issue should be closed?  I can't say.

I hope someone really smart is able to provide some serious speed optimization for the new LFU cache.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Lance Norskog (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148725#comment-13148725 ] 

Lance Norskog commented on SOLR-2889:
-------------------------------------

If you are working on the caches, better instrumentation would be a good feature. I would like to know how much 'churn' is happening: if I have 90% semi-permanent members and 10% constantly changing, that means my cache needs tuning.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Simon Willnauer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148133#comment-13148133 ] 

Simon Willnauer commented on SOLR-2889:
---------------------------------------

Shawn, don't worry we can iterate over you patches! I am optimistic! Welcome to solr & lucene :)
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149236#comment-13149236 ] 

Shawn Heisey commented on SOLR-2889:
------------------------------------

bq. ConcurrentLRUCache has been moved out of SolrJ and into Solr Core in trunk and 3x in SOLR-2758. Which source code checkout are you looking at?

I'm looking at 3.4.0, the version I'm running.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148078#comment-13148078 ] 

Yonik Seeley commented on SOLR-2889:
------------------------------------

bq. It's reasonable to expect that process to happen on new additions to what is a singular project.

Sorry, no. It's completely unrelated.  Adding conditional numHits per entry to FastLRUCache would be a small simple change.  Trying to figure out what/how to refactor Solr's current caching into a module is at least an order of magnitude (or two) harder.  You can tackle that anytime you feel like it, but don't force it on anyone else.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149179#comment-13149179 ] 

Shawn Heisey commented on SOLR-2889:
------------------------------------

FastLRUCache uses ConcurrentLRUCache, which includes a full class for a cache entry.  A new member could be added to CacheEntry pretty easily to track usage, but the rest of the code would have to me modified to use it.  Making sure it's all thread-safe would probably be the hard part.

LRUCache relies on the alternate sort order on LinkedHashMap, so it would not be as simple to add usage tracking.

Something I noticed along the way: The solrj tree seems like an odd place for ConcurrentLRUCache, because nothing else in that section uses it (directly at least).

                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149949#comment-13149949 ] 

Shawn Heisey commented on SOLR-2889:
------------------------------------

After a close look, I find overall understanding elusive, and I have been told by my employer not to spend a lot of time on it.  It must be relegated to my spare time, which is pretty scarce.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Jason Rutherglen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148003#comment-13148003 ] 

Jason Rutherglen commented on SOLR-2889:
----------------------------------------

+1 - Put it in Lucene and NOT Solr.  thanks.  

When this is implemented, using Google collections should be developed as well (which appropriately jettisons the cache values before OOM), ala the previously created though not committed SOLR-1513.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255727#comment-13255727 ] 

Shawn Heisey commented on SOLR-2889:
------------------------------------

bq. Would you mind posting some information about the results of your work and how much performance gain you made. If you have benchmark results this would be ideal. Did you notice any increase/decrease in memory and CPU demand?

I haven't done any extensive testing.  The testing that I did do for SOLR-2906 suggested that the LFU cache did not offer any performance benefit over LRU, but that it didn't really cause a performance detriment either.  I think this means that the idea was sound, but any speedups gained from the different methodology were lost because of the basic and non-optimized implementation.

It was not a definitive test - I have two copies of my production distributed index for redundancy purposes, with haproxy doing load balancing between the two.  I can set one set of servers to LFU and the other to LRU, but it's production, so the two sets of servers never receive the same queries and I don't really want to try any isolation tests on production equipment.  My testbed is too small for a doing tests with all production data - one server with all resources smaller than production.  I could do some tests with smaller data sets that will fit entirely in RAM, but that will take a lot of planning that I currently don't have time to do.

The LRU cache is highly optimized for speed, but I didn't really understand the optimizations and they don't apply to LFU as far as I can tell.  At this time I am still using LRU cache because I don't dare change the production configuration without authorization and I can't leave production servers in test mode for very long.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Jason Rutherglen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148065#comment-13148065 ] 

Jason Rutherglen commented on SOLR-2889:
----------------------------------------

Yonik, Take a step back.  No analyzers are in Solr, and the caching and other 'parts' will be moved out.  It's reasonable to expect that process to happen on new additions to what is a singular project.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "austin collins (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255108#comment-13255108 ] 

austin collins commented on SOLR-2889:
--------------------------------------

@Shawn I have been reading though your comments. Well done for taking on this project, it sounds like a good achievement and addition to the open-s-c. 

You set out with a goal, and I guess this was to make performance gains. However you don't sound confident you have achieved this when you ask for someone to provide some speed optimization. 

Would you mind posting some information about the results of your work and how much performance gain you made. If you have benchmark results this would be ideal. Did you notice any increase/decrease in memory and CPU demand?
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148880#comment-13148880 ] 

Shawn Heisey commented on SOLR-2889:
------------------------------------

@lance: To say I'm working on it is very much an overstatement.  I have taken a quick look, enough to know that I may be in over my head, requiring a lot of learning before diving in.  I will give it a try, but Yoda probably would not be impressed by the results.
                
> Implement Adaptive Replacement Cache
> ------------------------------------
>
>                 Key: SOLR-2889
>                 URL: https://issues.apache.org/jira/browse/SOLR-2889
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Currently Solr's caches are LRU, which doesn't look at hitcount to decide which entries are most important.  There is a method that takes both frequency and time of cache hits into account:
> http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
> If it's feasible, this could be a good addition to Solr/Lucene.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org