Posted to solr-dev@lucene.apache.org by Fuad Efendi <fu...@efendi.ca> on 2008/04/02 03:58:16 UTC

LRUCache - synchronized!?

Can we have anything better? I can't make use of all 4 CPUs :(
Thanks!

P.S.
Blocked on lock
LRUCache.java:128




RE: LRUCache - synchronized!?

Posted by Bambarbia <ba...@kirkudu.com>.
I didn't want to offend anyone...
Thanks


Funtick wrote:
> 
> Hello:
> 
> 
> I didn't want to offend anyone on this mailing list by posting this
> message, but I simply can't publish new messages since my last post. Is there
> any kind of filtering/moderation?
> 
> An LRUCache-related post which never reached the solr-dev list:
> http://www.nabble.com/Server-Hungs-with-Lucene-2.3.1-td16700425.html
> 
> - I never had an OutOfMemoryError before, when I used nightly cache
> refreshes; now I hit it almost every day with a static index (without
> rewarming the cache). The caches are properly configured, and memory is
> enough: 8192MB allocated to Tomcat, 16GB RAM. The only difference before the
> OutOfMemoryError appeared: I temporarily disabled SOLR replication.
> 
> I sent a message to solr-dev-owner; I can't post new subjects, but I can
> still reply.
> 
> Thanks,
> Fuad
> 
> 
> 
> Funtick wrote:
>> 
>> And having code like
>> Object get(Object key) {
>> 	synchronized (map) {
>> 		... <atomic>.incrementAndGet()
>> 		...
>> 	}
>> }
>> forces me to do more "sanity checks"...
> 
> 

-- 
View this message in context: http://www.nabble.com/LRUCache---synchronized%21--tp16439831p16963769.html
Sent from the Solr - Dev mailing list archive at Nabble.com.


RE: LRUCache - synchronized!?

Posted by Funtick <fu...@efendi.ca>.
Hello:


I didn't want to offend anyone on this mailing list by posting this message,
but I simply can't publish new messages since my last post. Is there any kind
of filtering/moderation?

An LRUCache-related post which never reached the solr-dev list:
http://www.nabble.com/Server-Hungs-with-Lucene-2.3.1-td16700425.html

- I never had an OutOfMemoryError before, when I used nightly cache refreshes;
now I hit it almost every day with a static index (without rewarming the
cache). The caches are properly configured, and memory is enough: 8192MB
allocated to Tomcat, 16GB RAM. The only difference before the
OutOfMemoryError appeared: I temporarily disabled SOLR replication.

I sent a message to solr-dev-owner; I can't post new subjects, but I can
still reply.

Thanks,
Fuad



Funtick wrote:
> 
> And having code like
> Object get(Object key) {
> 	synchronized (map) {
> 		... <atomic>.incrementAndGet()
> 		...
> 	}
> }
> forces me to do more "sanity checks"...

-- 
View this message in context: http://www.nabble.com/LRUCache---synchronized%21--tp16439831p16963558.html
Sent from the Solr - Dev mailing list archive at Nabble.com.


Re: LRUCache - synchronized!?

Posted by Chris Hostetter <ho...@fucit.org>.
: Is there anywhere we can make a note of this so when we do go to 1.5 it gets
: put in the code?

you're confusing the Lucene compatibility requirements with Solr -- Solr 
has always required 1.5, so if anyone wants to contribute a 
ConcurrentHashMap based Cache (with some benchmarks demonstrating
when and what the advantages are) we can start using it.


-Hoss


Re: LRUCache - synchronized!?

Posted by Yonik Seeley <yo...@apache.org>.
On Fri, Apr 18, 2008 at 1:56 AM, Ian Holsman <li...@holsman.net> wrote:
>  but there is a ConcurrentSkipListMap in 1.6, and with jsr166x.jar (that
> contains the additions to the concurrent package to make it similar to
> 1.6's) it can be used in 1.5

Higher overhead wouldn't be worth it.

Synchronization on the cache is just around a get/put.  It's extremely
unlikely for that to be any kind of bottleneck, because the accessing
code has to do something with each lookup - and that is where the time
will be spent.

-Yonik

Re: LRUCache - synchronized!?

Posted by Ian Holsman <li...@holsman.net>.
Yonik Seeley wrote:
> On Thu, Apr 17, 2008 at 11:29 PM, Ian Holsman <li...@holsman.net> wrote:
>   
>> Is there anywhere we can make a note of this so when we do go to 1.5 it gets
>> put in the code?
>>     
>
> What, ConcurrentHashMap?
> I briefly considered it when I threw the caching stuff together... but
> the key here is that it's an LRUCache using LinkedHashMap, and there
> is no ConcurrentLinkedHashMap.
>
>   
but there is a ConcurrentSkipListMap in 1.6, and with jsr166x.jar (that 
contains the additions to the concurrent package to make it similar to 
1.6's) it can be used in 1.5

> Oh, and Solr has always required 1.5 anyway.
>
> -Yonik
>
>   


RE: LRUCache - synchronized!?

Posted by Fuad Efendi <fu...@efendi.ca>.
> What, ConcurrentHashMap?
> I briefly considered it when I threw the caching stuff together... but
> the key here is that it's an LRUCache using LinkedHashMap, and there
> is no ConcurrentLinkedHashMap.
> -Yonik
> 


There is a lot more... Ask Doug Lea; only a small piece of his work is part of
Java 5/6, after 12 years of proof-of-concept work...
As an example, EhCache has different builds depending on the Java version...


P.S. 
I am unable to post a message regarding an OutOfMemoryError (-Xmx8192M
-Xms8192M, 16GB RAM) (something cache-related, probably something
'unsynchronized'; it happens occasionally, under high load)


Re: LRUCache - synchronized!?

Posted by Ian Holsman <li...@holsman.net>.
Chris Hostetter wrote:
> : I briefly considered it when I threw the caching stuff together... but
> : the key here is that it's an LRUCache using LinkedHashMap, and there
> : is no ConcurrentLinkedHashMap.
>
> But we could have an alternate ConcurrentHashMap based SolrCache that 
> isn't LRU for people who plan on sizing their Caches big enough that they 
> don't care about the replacement strategy (random replacement could be 
> "good enough")
>
> Random thought: could we do better using a combination of 
> a ConcurrentLinkedQueue for keys and a WeakHashRef for the key=>value 
> pairs?  would that even work?  (i'm not sure what the semantics of a Queue 
> are when the item is already in the queue ... i'm probably smoking crack)
>
>   
If you can stand O(log N) you could use a priority queue/skip list that
has a concurrent implementation (although I couldn't find an ASL-friendly
Java implementation of one, so it means someone has to do some work)

it's a bit late for a SoC project as well ;(
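
A rough sketch of the skip-list idea, under heavy assumptions: eviction order
kept in a ConcurrentSkipListMap (O(log N) per update), lookups served
lock-free from a ConcurrentHashMap. This gives FIFO rather than true LRU, the
two maps are only loosely consistent, and the class name is hypothetical; an
illustration, not production code.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

public class SkipListFifoCache<K, V> {
  private final ConcurrentHashMap<K, V> data = new ConcurrentHashMap<K, V>();
  private final ConcurrentSkipListMap<Long, K> order =
      new ConcurrentSkipListMap<Long, K>();
  private final AtomicLong seq = new AtomicLong();
  private final int maxSize;

  public SkipListFifoCache(int maxSize) { this.maxSize = maxSize; }

  // Lock-free read: no reordering, so no synchronization needed.
  public V get(K key) { return data.get(key); }

  public V put(K key, V value) {
    order.put(seq.incrementAndGet(), key);       // O(log N) bookkeeping
    V old = data.put(key, value);
    while (data.size() > maxSize) {
      // Evict the oldest insertion; pollFirstEntry() is atomic.
      Map.Entry<Long, K> eldest = order.pollFirstEntry();
      if (eldest == null) break;
      data.remove(eldest.getValue());
    }
    return old;
  }
}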

>
> -Hoss
>
>
>   


Re: LRUCache - synchronized!?

Posted by Mike Klaas <mi...@gmail.com>.
On 17-Apr-08, at 9:03 PM, Chris Hostetter wrote:
>
> : I briefly considered it when I threw the caching stuff together... but
> : the key here is that it's an LRUCache using LinkedHashMap, and there
> : is no ConcurrentLinkedHashMap.
>
> But we could have an alternate ConcurrentHashMap based SolrCache that
> isn't LRU for people who plan on sizing their Caches big enough that they
> don't care about the replacement strategy (random replacement could be
> "good enough")
>
> Random thought: could we do better using a combination of
> a ConcurrentLinkedQueue for keys and a WeakHashRef for the key=>value
> pairs?  would that even work?  (i'm not sure what the semantics of a Queue
> are when the item is already in the queue ... i'm probably smoking crack)

Well, I should have thought of this when replying to Fuad to begin
with, but the single-cpu pegging behaviour of faceting is expected
when looking at a single request (all computation is occurring in one
thread).

If this is indeed due to multiple requests, and synchronization is really
taking up more time than the bitset intersections, then it is highly
likely that the nature of the faceting problem would benefit from a
different approach ((multivalued) FieldCache, perhaps)

-Mike

Re: LRUCache - synchronized!?

Posted by Chris Hostetter <ho...@fucit.org>.
: I briefly considered it when I threw the caching stuff together... but
: the key here is that it's an LRUCache using LinkedHashMap, and there
: is no ConcurrentLinkedHashMap.

But we could have an alternate ConcurrentHashMap based SolrCache that 
isn't LRU for people who plan on sizing their Caches big enough that they 
don't care about the replacement strategy (random replacement could be 
"good enough")

Random thought: could we do better using a combination of 
a ConcurrentLinkedQueue for keys and a WeakHashRef for the key=>value 
pairs?  would that even work?  (i'm not sure what the semantics of a Queue 
are when the item is already in the queue ... i'm probably smoking crack)
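
A minimal sketch of the first idea: a non-LRU cache on a plain
ConcurrentHashMap with a hard size cap and arbitrary (effectively random)
replacement. The class and method names are hypothetical, not Solr's actual
SolrCache API:

import java.util.Iterator;
import java.util.concurrent.ConcurrentHashMap;

public class UnorderedCache<K, V> {
  private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<K, V>();
  private final int maxSize;

  public UnorderedCache(int maxSize) { this.maxSize = maxSize; }

  // Lock-free read: no access-order bookkeeping, hence no synchronization.
  public V get(K key) { return map.get(key); }

  public V put(K key, V value) {
    if (map.size() >= maxSize) {
      // Evict an arbitrary entry. Hash iteration order is effectively
      // random, and ConcurrentHashMap iterators are weakly consistent,
      // so this is safe (if racy) under concurrent access.
      Iterator<K> it = map.keySet().iterator();
      if (it.hasNext()) {
        map.remove(it.next());
      }
    }
    return map.put(key, value);
  }
}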



-Hoss


Re: LRUCache - synchronized!?

Posted by Yonik Seeley <yo...@apache.org>.
On Thu, Apr 17, 2008 at 11:29 PM, Ian Holsman <li...@holsman.net> wrote:
> Is there anywhere we can make a note of this so when we do go to 1.5 it gets
> put in the code?

What, ConcurrentHashMap?
I briefly considered it when I threw the caching stuff together... but
the key here is that it's an LRUCache using LinkedHashMap, and there
is no ConcurrentLinkedHashMap.

Oh, and Solr has always required 1.5 anyway.

-Yonik

Re: LRUCache - synchronized!?

Posted by Ian Holsman <li...@holsman.net>.
Is there anywhere we can make a note of this so when we do go to 1.5 it 
gets put in the code?

and possibly on the SolrPerformance wiki page so that 1.5 users can get 
the performance boost I'm expecting this kind of thing would give.

are there any disadvantages of using ConcurrentHashMap that a java n00b 
like me should be aware of? anyone else using this in production?

and also, are there any other things like this in the code that could be
"instant wins"?

Fuad Efendi wrote:
> :)
>   
>> Have you tried using a ConcurrentHashMap? 
>>     
> - Of course. 
>
> And having code like
> Object get(Object key) {
> 	synchronized (map) {
> 		... <atomic>.incrementAndGet()
> 		...
> 	}
> }
> forces me to do more "sanity checks"... Now I understand why SOLR uses
> single overloaded CPU in 4-CPU system when faceting is enabled.
>
> Even the get() method is synchronized...
>
> I need to learn how to use plugin jars, don't want to alter source...
> Thanks!
>
>
>   
>> On 1-Apr-08, at 6:58 PM, Fuad Efendi wrote:
>>     
>>> Can we have anything better? I can't make use of all 4 CPUs :(
>>> Thanks!
>>>       
>> You can have anything your heart desires... Solr is open-source :)
>>
>> Have you tried using a ConcurrentHashMap?
>>
>> -Mike
>>
>>
>>     
>
>
>   


RE: LRUCache - synchronized!?

Posted by Fuad Efendi <fu...@efendi.ca>.
:)
> Have you tried using a ConcurrentHashMap? 
- Of course. 

And having code like
Object get(Object key) {
	synchronized (map) {
		... <atomic>.incrementAndGet()
		...
	}
}
forces me to do more "sanity checks"... Now I understand why SOLR uses
single overloaded CPU in 4-CPU system when faceting is enabled.

Even the get() method is synchronized...

I need to learn how to use plugin jars, don't want to alter source...
Thanks!


> On 1-Apr-08, at 6:58 PM, Fuad Efendi wrote:
> > Can we have anything better? I can't make use of all 4 CPUs :(
> > Thanks!
> 
> You can have anything your heart desires... Solr is open-source :)
> 
> Have you tried using a ConcurrentHashMap?
> 
> -Mike
> 
> 


Re: LRUCache - synchronized!?

Posted by Mike Klaas <mi...@gmail.com>.
On 1-Apr-08, at 6:58 PM, Fuad Efendi wrote:
> Can we have anything better? I can't make use of all 4 CPUs :(
> Thanks!

You can have anything your heart desires... Solr is open-source :)

Have you tried using a ConcurrentHashMap?

-Mike

Re: LRUCache - synchronized!?

Posted by Yonik Seeley <yo...@apache.org>.
On Tue, Apr 29, 2008 at 2:37 PM, Funtick <fu...@efendi.ca> wrote:
>
>  Thanks Otis;
>
>
>  > > Object get(Object key) {
>  > >     synchronized (map) {
>  > >         ... <atomic>.incrementAndGet()
>  > >         ...
>  > >     }
>  > > }
>
>  Existing code does not slow down performance in the simplest cases. However, it
>  slows down with read-only faceted queries, because each such query hits the cache
>  thousands of times even in the simplest scenario.
>
>  This is an ACCESS-ORDERED (LRU implementation) LinkedHashMap: map = new
>  LinkedHashMap(initialSize, 0.75f, true)
>  - so even a GET is a structural modification (changing the order of a
>  list!) and should be synchronized... and even if it is not, "changing order"
>  costs some CPU time.
>
>  Simplest way to improve: go with an INSERTION-ORDERED (FIFO implementation)
>  linked hash map, and unsynchronize GET(); acceptable for many cases.

Why don't you try this and report the results.
I still think that the bottleneck is likely to be something else
(after all, you have to do quite a bit of work for every item
retrieved from the cache or inserted into the cache).

>  I also don't understand why regenerateItem() is not synchronized (warm()
>  method in LRUCache should be synchronized on _local_ map).

After a quick re-inspection of this code, it looks fine to me. If you
see something that isn't thread safe, please let us know.

-Yonik

Re: LRUCache - synchronized!?

Posted by Funtick <fu...@efendi.ca>.
Thanks Otis;

> > Object get(Object key) {
> >     synchronized (map) {
> >         ... <atomic>.incrementAndGet()
> >         ...
> >     }
> > }

Existing code does not slow down performance in the simplest cases. However, it
slows down with read-only faceted queries, because each such query hits the cache
thousands of times even in the simplest scenario.

This is an ACCESS-ORDERED (LRU implementation) LinkedHashMap: map = new
LinkedHashMap(initialSize, 0.75f, true)
- so even a GET is a structural modification (changing the order of a
list!) and should be synchronized... and even if it is not, "changing order"
costs some CPU time.

Simplest way to improve: go with an INSERTION-ORDERED (FIFO implementation)
linked hash map, and unsynchronize GET(); acceptable for many cases.
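
For reference, a minimal sketch of the two LinkedHashMap modes under
discussion (the class name and capacity handling are just for illustration):

import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedMap<K, V> extends LinkedHashMap<K, V> {
  private final int capacity;

  // accessOrdered = true  -> access-ordered: every get() moves the entry
  //                          to the tail, a structural modification, hence
  //                          the need to synchronize get()
  // accessOrdered = false -> insertion-ordered (FIFO): get() leaves the
  //                          linked list untouched
  public BoundedMap(int capacity, boolean accessOrdered) {
    super(capacity, 0.75f, accessOrdered);
    this.capacity = capacity;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Evicts the least-recently-used entry in access order,
    // or the oldest insertion in FIFO order.
    return size() > capacity;
  }
}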

I also don't understand why regenerateItem() is not synchronized (warm()
method in LRUCache should be synchronized on _local_ map).


P.S.
Most probably we can safely remove 'synchronized' from GET() in
Access-Ordered LRU, accepting fact that some entries could be wrongly
removed from LRU cache and we are not iterating the map... Same with PUT(),
no any risk with size() and removeEldestEntry() (I hope, no any OOM)...

EHCache also bases its LRU on LinkedHashMap:
  public final class SpoolingLinkedHashMap extends java.util.LinkedHashMap
Their net.sf.ehcache.Cache class has internally synchronized get() and put()
methods.


P.P.S.
What about this double synchronization:
public synchronized Object put(Object key, Object value) {
  if (state == State.LIVE) {
    stats.inserts.incrementAndGet();
  }
  synchronized (map) {
    // increment local inserts regardless of state???
    // it does make it more consistent with the current size...
    inserts++;
    return map.put(key,value);
  }
}
On Windows/Intel, a single-threaded 'synchronized' statement costs roughly an
additional 600 CPU cycles...
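
One possibility (an untested sketch against the snippet above, not a patch; it
drops the outer method lock and moves the stats update under the single map
lock, which slightly changes when the counter is bumped) would be:

public Object put(Object key, Object value) {
  // one lock instead of two; the stats counter is now incremented
  // while holding the map lock
  synchronized (map) {
    if (state == State.LIVE) {
      stats.inserts.incrementAndGet();
    }
    inserts++;
    return map.put(key,value);
  }
}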

-Fuad
-- 
View this message in context: http://www.nabble.com/LRUCache---synchronized%21--tp16439831p16967253.html
Sent from the Solr - Dev mailing list archive at Nabble.com.