Posted to java-user@lucene.apache.org by Mirko Sertic <mi...@web.de> on 2013/08/20 19:53:27 UTC

Optimize Lucene 4.4 for CPU usage

Hi there

I am using Lucene 4.4, and I am hitting CPU usage limits on my Core
i7 Windows 7 64-bit box. It seems the I/O system (SSD) still has
capacity, but when running 8 threads searching the index in parallel,
all logical CPU cores are at 100% usage.

Is there a common way to optimize query throughput and lower
CPU usage? For instance, I am thinking index compression could be
disabled, as index size is not the problem.

Any ideas?

Thanks in advance
Mirko

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Optimize Lucene 4.4 for CPU usage

Posted by Adrien Grand <jp...@gmail.com>.
Hi,

On Sat, Aug 31, 2013 at 6:55 AM, Rose, Stuart J <st...@pnnl.gov> wrote:
> I've noticed that processes that were previously IO bound (in 3.5) are now CPU bound (in 4.4) and I expect it is due to the compression/decompression of term vector fields in 4.4.
>
> It would be nice if users of 4.4 could turn the compression OFF entirely.

Even though the default Lucene codec just tries to make good
trade-offs regarding I/O vs. CPU usage for most use-cases, it is
possible that it is not optimal for your use-case. If this is a
problem, it is possible to change the trade-offs by writing a custom
codec.

-- 
Adrien



RE: Optimize Lucene 4.4 for CPU usage

Posted by "Rose, Stuart J" <st...@pnnl.gov>.
I've noticed that processes that were previously IO bound (in 3.5) are now CPU bound (in 4.4) and I expect it is due to the compression/decompression of term vector fields in 4.4.

It would be nice if users of 4.4 could turn the compression OFF entirely.



-----Original Message-----
From: Ivan Krišto [mailto:ivan.kristo@gmail.com] 
Sent: Wednesday, August 21, 2013 12:45 PM
To: java-user@lucene.apache.org
Subject: Re: Optimize Lucene 4.4 for CPU usage

On 08/20/2013 07:53 PM, Mirko Sertic wrote:
> I am using Lucene 4.4, and I am hitting CPU usage limits on my
> Core i7 Windows 7 64-bit box. It seems the I/O system (SSD) still has
> capacity, but when running 8 threads searching the index in
> parallel, all logical CPU cores are at 100% usage.
>
> Is there a common way to optimize query throughput and lower
> CPU usage? For instance, I am thinking index compression could be
> disabled, as index size is not the problem.

Have you tried profiling the code to rule out options?
Plain JVisualVM (the free Java profiler that comes with the JDK) should do the trick. Just run the profiler against Lucene and check which methods take most of the CPU time. Maybe some serialization outside Lucene takes most of the CPU time.


  Regards,
    Ivan Krišto



Re: Optimize Lucene 4.4 for CPU usage

Posted by Ivan Krišto <iv...@gmail.com>.
On 08/20/2013 07:53 PM, Mirko Sertic wrote:
> I am using Lucene 4.4, and I am hitting CPU usage limits on my
> Core i7 Windows 7 64-bit box. It seems the I/O system (SSD) still has
> capacity, but when running 8 threads searching the index in
> parallel, all logical CPU cores are at 100% usage.
>
> Is there a common way to optimize query throughput and lower
> CPU usage? For instance, I am thinking index compression could be
> disabled, as index size is not the problem.

Have you tried profiling the code to rule out options?
Plain JVisualVM (the free Java profiler that comes with the JDK) should do the
trick. Just run the profiler against Lucene and check which methods take
most of the CPU time. Maybe some serialization outside Lucene takes most of
the CPU time.


  Regards,
    Ivan Krišto
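
Before attaching a full profiler, a rough first check of whether a search thread is really CPU bound is to compare the CPU time it consumed with the wall-clock time that elapsed. The following is only a minimal sketch using the JDK's ThreadMXBean; the class name CpuBoundCheck and the methods cpuFraction and busySpin are hypothetical names chosen for illustration and are not part of Lucene, and the Runnable stands in for whatever search call is being measured. It assumes a Java 8 or later runtime (for the lambda syntax) and a JVM that supports per-thread CPU time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CpuBoundCheck {

    /**
     * Runs the task on the calling thread and returns the ratio of CPU time
     * to wall-clock time. A value near 1.0 suggests the task is CPU bound;
     * a value near 0.0 suggests it mostly waits (I/O, locks, sleeping).
     */
    public static double cpuFraction(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported()) {
            throw new UnsupportedOperationException(
                    "per-thread CPU time is not supported on this JVM");
        }
        long wallStart = System.nanoTime();
        long cpuStart = threads.getCurrentThreadCpuTime();
        task.run();
        long cpuNanos = threads.getCurrentThreadCpuTime() - cpuStart;
        long wallNanos = System.nanoTime() - wallStart;
        return wallNanos <= 0 ? 0.0 : (double) cpuNanos / wallNanos;
    }

    /** Hypothetical stand-in for a CPU-heavy task: spins for the given time. */
    public static void busySpin(long millis) {
        long end = System.nanoTime() + millis * 1_000_000L;
        while (System.nanoTime() < end) {
            // burn CPU until the deadline passes
        }
    }

    public static void main(String[] args) {
        // Replace the lambda body with the search call you want to measure.
        double fraction = cpuFraction(() -> busySpin(200));
        System.out.println("CPU time / wall time: " + fraction);
    }
}
```

This only measures the calling thread, so it does not replace a profiler: it says whether the time is spent on the CPU, not which methods spend it.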
