You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Shawn Heisey (Created) (JIRA)" <ji...@apache.org> on 2011/11/18 16:22:53 UTC

[jira] [Created] (SOLR-2906) Implement LFU Cache

Implement LFU Cache
-------------------

                 Key: SOLR-2906
                 URL: https://issues.apache.org/jira/browse/SOLR-2906
             Project: Solr
          Issue Type: Sub-task
          Components: search
    Affects Versions: 3.4
            Reporter: Shawn Heisey
            Priority: Minor


Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Issue Comment Edited] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173378#comment-13173378 ] 

Erick Erickson edited comment on SOLR-2906 at 12/20/11 6:25 PM:
----------------------------------------------------------------

Right, there's a lot of code to wrap your head around and it can be bewildering. And compared to what you've already done, this is easy....

But lots of things are surprisingly easy in Solr, except when they're really hard <G>. 

This is just about allowing another parameter to be specified in the declaration in solrconfig.xml, similar to initialSize, autowarmCount, etc. cleanupThread is probably a good exemplar.

Take a look at LFUCache.java.init for how all of the other ones are parsed.

Then just pass that through to the ConcurrentLFUCache, conditionally doing the "ce.hits.set(ce.hitsCopy >>> 1);" line. Defaulting the timeDecay=true if not present.

So this should be relatively few lines of actual code, making some automated tests will actually take more time I suspect, as well as any Wiki documentation if you're feeling ambitious....
                
      was (Author: erickerickson):
    Right, there's a lot of code to wrap your head around and it can be bewildering. And compared to what you've already done, this is easy....

But lots of things are surprisingly easy in Solr, except when they're not really hard <G>. 

This is just about allowing another parameter to be specified in the declaration in solrconfig.xml, similar to initialSize, autowarmCount, etc. cleanupThread is probably a good exemplar.

Take a look at LFUCache.java.init for how all of the other ones are parsed.

Then just pass that through to the ConcurrentLFUCache, conditionally doing the "ce.hits.set(ce.hitsCopy >>> 1);" line. Defaulting the timeDecay=true if not present.

So this should be relatively few lines of actual code, making some automated tests will actually take more time I suspect, as well as any Wiki documentation if you're feeling ambitious....
                  
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153288#comment-13153288 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

bq. But really, it seems like you should disregard all the algorithmic stuff in LRU when implementing LFU. If you think you see a bug in the existing LRU stuff, you're going to have to spell it out for me a bit more.

I can't actually say that there is a bug, but I have to say that I'm really confused (in the LRU code) by what lastAccessed(Copy) actually is and how it works, and what the following code pieces from markAndSweep are doing with it, since wantToKeep and wantToRemove are entry counts:

{code}
long thisEntry = ce.lastAccessedCopy;
  <snip>
if (thisEntry > newestEntry - wantToKeep) {
  <snip>
} else if (thisEntry < oldestEntry + wantToRemove) { // entry in bottom group?
{code}

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154514#comment-13154514 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

bq. shawn, is it possible to upload a diff file (patch).

These are all new files, no files changed.  "svn diff" returns nothing.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153249#comment-13153249 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

I've been trying to find my bug.  Looking back at the original LRU implementation, I have no idea how it's working.

When a CacheEntry is created in the LRU code, one of the values sent in is an incremented stats.accessCounter, which gets called lastAccessed in the new object.  When it is later used in markAndSweep, it is used in simple math along with the number of items that we want to keep/remove.  This is very confusing, and I can't see how it could ever work.  It might be that part of the code is simply skipped because of the way the math happens to work out.

When I changed it around to use hitcounts, those counts are also used in the previously mentioned simple math, and I believe that results in some very weird behavior, such as removing most (or possibly all) of the cache entries.

It appears that this idea is a lot more complicated than I originally thought, and that the current code needs to be at least partially rewritten.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment: FastLFUCache.java
                ConcurrentLFUCache.java

Now markAndSweep builds a treeset of the least used items, then uses the treeset to evict entries.

This is almost guaranteed to be an inefficient way to do things.  If it works, I leave optimization to the experts.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, ConcurrentLFUCache.java, FastLFUCache.java, FastLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Issue Comment Edited] (SOLR-2906) Implement LFU Cache

Posted by "Yonik Seeley (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172605#comment-13172605 ] 

Yonik Seeley edited comment on SOLR-2906 at 12/19/11 9:01 PM:
--------------------------------------------------------------

Actually, I was thinking count >>= 1 (divide by 2).
We could make it optional (timeDecay=true), but my gut feeling says it should be on by default.
                
      was (Author: yseeley@gmail.com):
    Actually, I was thinking count >>= 1 (divide by 2).
We could make it optional, but my gut feeling says it should be on by default.
                  
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166407#comment-13166407 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

Some additional info: The index is composed of six large shard cores and a small one, running on two servers.  The total index size on each server (CentOS 6, java 1.6.0_29) is about 60GB.  Solr/Jetty has an 8GB heap.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-2906:
---------------------------------

    Attachment: SOLR-2906.patch

Mostly cosmetic changes:

Changed acceptableLimit to acceptableSize to keep it named consistently

Formatted all the files

Implemented Yonik's aging suggestion (but no tests, there doesn't seem to be a clean way to implement a test here without creating debug-only code).

I'm not wholly convinced that dividing by 4 is the right thing to do here; it'll tend to flatten all the entries making removal somewhat arbitrary as after a few passes anything with hits in the low range will collapse to zero. That said, though, since the little adventure with lastAccessed, all entries with the same number of hits will be treated as LRU so I guess it works.

Marked code as experimental

Commented out some debugging code
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166399#comment-13166399 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

I finally got a chance to do some testing in production.  I have two distributed index chains, both running 3.5.0 with this patch and the one from SOLR-1972 applied.  The chains are updated independently, there is no replication.  It's not a truly definitive test, because my queries are load balanced between the two chains and I do have some hardware discrepancies.  I cannot create a valid test environment to compare the two caches as they should be compared, with identical queries going to two completely identical servers.

On average, commits made on chain A where filterCache is set to LFU and the servers have 48GB of RAM happen faster than those on chain B, with FastLRU and 64GB of RAM.  One of the two servers on chain A has slightly faster processors than its counterpart on chain B -- 2.83 GHz vs. 2.66 GHz.  The other two servers on both chains have 2.5 GHz processors.

I suspect that most of the potential gains that the LFU algorithm might be able to provide are swallowed by the very inefficient implementation.

If anyone has some thoughts for me to pursue, I will be happy to do so, but I am out of my own ideas.  I hope the patch will be committed.  It could use a lot of optimization and there's probably cosmetic cleanup to do.

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment: SOLR-2906.patch

I have added some text to CHANGES.TXT under 3.6.  Like before, my patch is against branch_3x.

Yonik, you may be right about lastAccessed.  I was striving for correctness on this first pass, but perhaps it's not worthwhile to care about that too much, just let Java's default functionality figure out which ones to evict.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Simon Willnauer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154482#comment-13154482 ] 

Simon Willnauer commented on SOLR-2906:
---------------------------------------

bq. IMHO, ready for review and possible inclusion. The javadoc and other comments were reviewed and modified, but not closely.

shawn, is it possible to upload a diff file (patch). If you are using a svn checkout make sure on add all new files (svn add) and then run svn diff > SOLR-2906.patch from the top level directory. This makes it easier to see what changed and we only have to apply a single file to test & review your changes. there is also a wiki for this: http://wiki.apache.org/lucene-java/HowToContribute (see How to create a patch)
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171641#comment-13171641 ] 

Erick Erickson commented on SOLR-2906:
--------------------------------------

What do the collective people who actually know the code think about this patch? From my perspective, the LFU is a classic alternative to LRU and seems like a fine addition. If there are no objections, I'll volunteer to commit this.

Shawn:
This is the kind of thing that could use two things:
1> text for CHANGES.txt describing the new functionality. If you could attach that as a separate patch file (or just e-mail it to me) that would be great. That file changes rapidly enough that I like to add the text at check-in time, it seems easier.

2> Adding a description to the Wiki since this is new functionality. Perhaps in http://wiki.apache.org/solr/SolrCaching?
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155267#comment-13155267 ] 

Yonik Seeley commented on SOLR-2906:
------------------------------------

bq. The cache size and the number of inserts don't match up. It's usually off by 10

Do you have any type of warming (autowarming or statically configured)?  Statistics are counted differently during warming.


                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154344#comment-13154344 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

I've re-added lastAccessed to the class, as a tiebreaker when hitcount is equal.

The test method prints out leastUsedItems and mostUsedItems.  Somehow, item number 50 is included in both.

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, ConcurrentLFUCache.java, ConcurrentLFUCache.java, FastLFUCache.java, FastLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment: FastLFUCache.java
                ConcurrentLFUCache.java

Here is the first crack at an LFU implementation.  There's some weirdness with negative numbers in the statistics, and I'm not sure that eviction and warming are working right, but I am having trouble getting a test environment fully operational.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153002#comment-13153002 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

I used branch_3x to create the above files.  I haven't even looked at trunk.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174475#comment-13174475 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

I must be dense.  I can figure out how to add the timeDecay option, but I can't figure out what section of code to enable/disable based on the value of timeDecay.  I've gone as far as doing a diff on my Nov 24th patch and the Dec 20th patch from Erick.  (doing diffs on diffs ... the world is going to explode!) The only differences I can see between the two is in whitespace/formatting.

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171903#comment-13171903 ] 

Yonik Seeley commented on SOLR-2906:
------------------------------------

IMO, this is appropriate for trunk.  If we want to commit to 3x, we should mark as experimental so we can change the default functionality if desired.

One concern I have with straight LFU is the lack of any kind of time sensitivity. For example, I ran some batch job that accessed the same filters a million times, and now they are stuck in the cache even though they haven't been used for hours or days.  Perhaps one way to handle this would be to do something like count>>=2 to everything when a cleaning out old entries.

I also wonder if it's worth keeping lastAccessed.  It's only valuable for breaking ties and I don't know how much that's actually needed.

Anyway, +1 to commit since this won't be used in the example solrconfig by default (and hence we can speed things up and tweak the algorithm before it possibly does get used by default).
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-2906:
---------------------------------

    Attachment: SOLR-2906.patch

Here's what I had in mind, at least I *think* this will do but all I've done is insured that the code compiles and the current LFU test suite runs.

Look in the diff for "timeDecay".

This still needs some proof that the new parameter comes through from a schema file. Let me know if that presents a problem or if you can't get 'round to it, I might have some time over Christmas.

I think maybe you were under the impression that this had already been done and were looking for it to be in the code already?
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153082#comment-13153082 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

I will fully admit that I built the new cache type from the old code without really understanding what the code was doing, and now I am out of my depth.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment:     (was: LFUCache.java)
    
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment:     (was: FastLFUCache.java)
    
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153335#comment-13153335 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

Would it be possible to adapt the lastAccessed shortcut to LFU, or would it be simply best to remove that section of code?
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-2906:
---------------------------------

    Attachment: SOLR-2906.patch

This should be the final patch. Added the stuff to actually get the parameter from solrconfig "timeDecay" which ages out the cache entries as we've discussed. Added tests to insure that it gets through from the config file.

Shawn: If you'd add some data to the Wiki about this new parameter, that would be a good thing.

If nobody objects, I'll probably check this in in the next couple of days. Since they're all new files, the patch will apply to both trunk and 3x cleanly.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195148#comment-13195148 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

bq. Did you ever update the Wiki with this new functionality? That'd be awesome....

Yes, I added LFUCache and the timeDecay option to the SolrCaching Wiki page.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 3.6, 4.0
>
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-2906:
---------------------------------

    Attachment: SOLR-2906.patch

Updated patch that divides by 2 and adds a unit test for aging out.

Shawn:

Could you add in the optional time decay as Yonik suggests? I agree that it seems like the right thing is to have this on by default. At that point, I think it'll be ready to check in. We can add documentation as we can.

We could  also check it in as is and raise another JIRA.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173378#comment-13173378 ] 

Erick Erickson commented on SOLR-2906:
--------------------------------------

Right, there's a lot of code to wrap your head around and it can be bewildering. And compared to what you've already done, this is easy....

But lots of things are surprisingly easy in Solr, except when they're not really hard <G>. 

This is just about allowing another parameter to be specified in the declaration in solrconfig.xml, similar to initialSize, autowarmCount, etc. cleanupThread is probably a good exemplar.

Take a look at LFUCache.java.init for how all of the other ones are parsed.

Then just pass that through to the ConcurrentLFUCache, conditionally doing the "ce.hits.set(ce.hitsCopy >>> 1);" line. Defaulting the timeDecay=true if not present.

So this should be relatively few lines of actual code, making some automated tests will actually take more time I suspect, as well as any Wiki documentation if you're feeling ambitious....
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173345#comment-13173345 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

bq. Could you add in the optional time decay as Yonik suggests? I agree that it seems like the right thing is to have this on by default. At that point, I think it'll be ready to check in. We can add documentation as we can.

I've looked at what Yonik has said and cannot figure out what I'd have to do.  I'm not completely ignorant, but there is a lot that I don't know.  I am amazed I was able to get this put together at all.

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment: SOLR-2906.patch

You're right, I had thought it was already done.  Now that I see what you've done, and looked up what >>> does, it makes sense, but when I read what Yonik was saying I had no idea.

Added a note to CHANGES.txt, created what is hopefully the final diff.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172564#comment-13172564 ] 

Erick Erickson commented on SOLR-2906:
--------------------------------------

We need to keep lastAccessed, removing it introduces a bug. The CacheEntry.compareTo method looks like this:

    public int compareTo(CacheEntry<K, V> that) {
      if (this.hitsCopy == that.hitsCopy) {
        if (this.lastAccessedCopy == that.lastAccessedCopy) {
          return 0;
        }
        return this.lastAccessedCopy < that.lastAccessedCopy ? 1 : -1;
      }
      return this.hitsCopy < that.hitsCopy ? 1 : -1;
    }

which is guaranteed to return non-zero unless the the exact same object is being compared since lastAccessed(Copy) is monotonically increasing.

If we remove the lastAccessed, then any two elements having the same number of hits compare as equal. Which also means that tree insertions overwrite each other in methods like getMost/LeastUsedItems. I don't know of any cheaper amounts of overhead to carry along to prevent this.

I've made a few, mostly cosmetic changes while trying to understand the code, I'll check it in shortly.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155290#comment-13155290 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

I might have figured out the problem, and if I have, the cache code is fine.  I just checked the log from my most recent run and have found that there are two errors from invalid filter queries.  I think this means that when a filter is invalid, inserts gets incremented but size doesn't.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Issue Comment Edited] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185197#comment-13185197 ] 

Erick Erickson edited comment on SOLR-2906 at 1/12/12 9:21 PM:
---------------------------------------------------------------

Fixed in 3x at r: 1230744
Fixed on trunk at r: 1230790 NOTE: r 1230741 was messed up.
 
Shawn:
Did you ever update the Wiki with this new functionality? That'd be awesome....
                
      was (Author: erickerickson):
    Fixed in 3x at r: 1230744
Fixed on trunk at r: 1230741

Shawn:
Did you ever update the Wiki with this new functionality? That'd be awesome....
                  
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 3.6, 4.0
>
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195324#comment-13195324 ] 

Erick Erickson commented on SOLR-2906:
--------------------------------------

Cool!

On Fri, Jan 27, 2012 at 2:16 PM, Shawn Heisey (Commented) (JIRA)

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 3.6, 4.0
>
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154465#comment-13154465 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

All known bugs found and fixed, unit test looks correct and passes.  This was created against branch_3x, but trunk probably won't be much different.

IMHO, ready for review and possible inclusion.  The javadoc and other comments were reviewed and modified, but not closely.

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153080#comment-13153080 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

Evictions definitely don't seem to be working right.  I finally got a benchmark script going.  I watched as the size of the filterCache climbed to the maximum size of 64.  On the next insert, it was suddenly only 7 entries, and the eviction counter had incremented from 290 to 348.  That seems really aggressive.  I seem to have done something wrong.

Once I got through with the benchmark script, I did a commit, at that moment the size was 50 out of 64.  After warming (autowarmCount 16), the cache size was 12, and both the hits and lookups were -12 (negative).
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment:     (was: FastLFUCache.java)
    
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment: TestLFUCache.java
                LFUCache.java
                ConcurrentLFUCache.java

I've renamed the user-facing class to LFUCache and created a test program based on the LRU version.  The tests are failing, though.  So far I can't figure out why.

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, ConcurrentLFUCache.java, ConcurrentLFUCache.java, FastLFUCache.java, FastLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155296#comment-13155296 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

I can't reproduce it with LFUCache or FastLRUCache by manually sending invalid queries, so that's the wrong idea.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment:     (was: TestLFUCache.java)
    
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Resolved] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson resolved SOLR-2906.
----------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0
                   3.6

Fixed in 3x at r: 1230744
Fixed on trunk at r: 1230741

Shawn:
Did you ever update the Wiki with this new functionality? That'd be awesome....
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 3.6, 4.0
>
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Assigned] (SOLR-2906) Implement LFU Cache

Posted by "Erick Erickson (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson reassigned SOLR-2906:
------------------------------------

    Assignee: Erick Erickson
    
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155280#comment-13155280 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

The only static warming I have is a *:* search with a sort parameter (and no filter query), which precaches my sort.  I do have autowarming configured, but this behavior happens from initial solr startup.  Things seem to behave correctly with warming - size is autowarmCount, inserts are zero.

FastLRUCache behavior:
queryResultCache: size is one higher than inserts
documentCache: size is one higher than inserts
filterCache: size and inserts are identical

LFUCache behavior:
queryResultCache: size is one higher than inserts
documentCache: size is one higher than inserts
filterCache: inserts is higher than size by a variable amount

I've seen 10 (on two different runs) and 2 (on the most recent run) as the difference between inserts and size.

The FastLRUCache behavior is seen on my production servers with production queries, the LFUCache behavior is on my test server with a benchmark script providing the queries.  I suppose there might be something weird about my canned queries that makes the filterCache behave differently, but the original source of the queries was a production Solr log at level INFO.

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155258#comment-13155258 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

Something odd happens with the filterCache.  When things are first starting off, the cache size and the number of inserts don't match up.  It's usually off by 10, with more in the inserts.  This doesn't seem to happen with the other cache types, also using LFU.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment: SOLR-2906.patch

Updated patch.  One of the bugs I had to fix was in the least/most items methods, so that I added new items to the TreeSet before removing old ones, because the one it just added might have the same hitcount as entries already present.  Without checking the new entry, I couldn't know which entry was the right one to remove.

This change reverts it to a remove then add when the hitCounts are different in the right direction.  When they are equal, it still does the add before the remove.  By reducing the size of the set before adding a new member whenever possible, there is a possibility it can go faster.

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment: TestLFUCache.java
                LFUCache.java
                ConcurrentLFUCache.java
    
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153258#comment-13153258 ] 

Yonik Seeley commented on SOLR-2906:
------------------------------------

bq. I've been trying to find my bug. Looking back at the original LRU implementation, I have no idea how it's working.

Heh... it is pretty complex, and I did try to add a ton of comments because of that.
The basic idea is that I wanted to avoid an O(n*log( n )) solution of sorting everything and then discarding the lowest.
In my testing, it seems to work, and we often just need to take a singe O( n ) pass to evict all the entries we need.

The comment at the top is the most important to understanding how it works:

{code}
    // if we want to keep at least 1000 entries, then timestamps of
    // current through current-1000 are guaranteed not to be the oldest (but that does
    // not mean there are 1000 entries in that group... it's acutally anywhere between
    // 1 and 1000).
    // Also, if we want to remove 500 entries, then
    // oldestEntry through oldestEntry+500 are guaranteed to be
    // removed (however many there are there).
{code}

But really, it seems like you should disregard all the algorithmic stuff in LRU when implementing LFU.
If you think you see a bug in the existing LRU stuff, you're going to have to spell it out for me a bit more.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172605#comment-13172605 ] 

Yonik Seeley commented on SOLR-2906:
------------------------------------

Actually, I was thinking count >>= 1 (divide by 2).
We could make it optional, but my gut feeling says it should be on by default.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment: SOLR-2906.patch

I figured out what I did wrong.  You have to 'svn add' the files before you can 'svn diff' :)

                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Issue Comment Edited] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155280#comment-13155280 ] 

Shawn Heisey edited comment on SOLR-2906 at 11/22/11 5:32 PM:
--------------------------------------------------------------

The only static warming I have is an all docs search with a sort parameter (and no filter query), which precaches my sort.  I do have autowarming configured, but this behavior happens from initial solr startup.  Things seem to behave correctly with warming - size is autowarmCount, inserts are zero.

FastLRUCache behavior:
queryResultCache: size is one higher than inserts
documentCache: size is one higher than inserts
filterCache: size and inserts are identical

LFUCache behavior:
queryResultCache: size is one higher than inserts
documentCache: size is one higher than inserts
filterCache: inserts is higher than size by a variable amount

I've seen 10 (on two different runs) and 2 (on the most recent run) as the difference between inserts and size.

The FastLRUCache behavior is seen on my production servers with production queries, the LFUCache behavior is on my test server with a benchmark script providing the queries.  I suppose there might be something weird about my canned queries that makes the filterCache behave differently, but the original source of the queries was a production Solr log at level INFO.

                
      was (Author: elyograg):
    The only static warming I have is a *:* search with a sort parameter (and no filter query), which precaches my sort.  I do have autowarming configured, but this behavior happens from initial solr startup.  Things seem to behave correctly with warming - size is autowarmCount, inserts are zero.

FastLRUCache behavior:
queryResultCache: size is one higher than inserts
documentCache: size is one higher than inserts
filterCache: size and inserts are identical

LFUCache behavior:
queryResultCache: size is one higher than inserts
documentCache: size is one higher than inserts
filterCache: inserts is higher than size by a variable amount

I've seen 10 (on two different runs) and 2 (on the most recent run) as the difference between inserts and size.

The FastLRUCache behavior is seen on my production servers with production queries, the LFUCache behavior is on my test server with a benchmark script providing the queries.  I suppose there might be something weird about my canned queries that makes the filterCache behave differently, but the original source of the queries was a production Solr log at level INFO.

                  
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153306#comment-13153306 ] 

Yonik Seeley commented on SOLR-2906:
------------------------------------

bq. since wantToKeep and wantToRemove are entry counts

Yes, it seems like mixing units.  But the key to understanding it is that lastAccessed isn't a real timestamp, but just a counter that is incremented for every access.  This means that if the latest "timestamp" is 5000, and we know we want to keep at least 1000 entries, then if we run across any timestamps greater than 4000 that it's guaranteed to be in the top 1000 entries and we don't need to consider it further.  One can just reverse that logic to figure out that an entry is definitely in the bottom group and we should immediately discard it.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-2906:
-------------------------------

    Attachment:     (was: ConcurrentLFUCache.java)
    
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, FastLFUCache.java, LFUCache.java, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2906) Implement LFU Cache

Posted by "Shawn Heisey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156273#comment-13156273 ] 

Shawn Heisey commented on SOLR-2906:
------------------------------------

Possibly false alarm.  Although I still do not know what causes the discrepancy between inserts and size on the filter cache, I can confirm that exactly the same thing happens when I change it to FastLRUCache, restart Solr, and fire up the benchmarking script.
                
> Implement LFU Cache
> -------------------
>
>                 Key: SOLR-2906
>                 URL: https://issues.apache.org/jira/browse/SOLR-2906
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 3.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: ConcurrentLFUCache.java, LFUCache.java, SOLR-2906.patch, TestLFUCache.java
>
>
> Implement an LFU (Least Frequently Used) cache as the first step towards a full ARC cache

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org