Posted to solr-user@lucene.apache.org by Marcus Herou <ma...@tailsweep.com> on 2008/01/22 06:44:01 UTC

OOE during indexing

Hi.

I get an OOE (OutOfMemoryError) with Solr 1.3. Autowarming seems to be the
villain, in conjunction with the FieldCache somehow.
JVM args: -Xmx512m -Xms512m -Xss128k

Index size is ~4 Million docs, where I index text and store database primary
keys.
du /srv/solr/feedItem/data/index/
1.7G    /srv/solr/feedItem/data/index/

To ensure that the docs I index do not swell too much, I only allow 5K per doc
to go over the wire, i.e. I take substring(0, 5000) of the field "content".

I have removed "firstSearcher" and "newSearcher" since the queries I used
before killed performance when reindexing the whole index. I will add them
back later, once the index is in a delta-update state.

Stacktrace.
[06:25:53.122] [null] /update wt=xml&version=2.2 0 3165
[06:25:53.877] Error during auto-warming of key:org.apache.solr.search.QueryResultKey@19695d2d:java.lang.OutOfMemoryError: Java heap space
[06:25:53.877]  at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:104)
[06:25:53.877]  at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:159)
[06:25:53.877]  at org.apache.lucene.index.SegmentMergeInfo.next(SegmentMergeInfo.java:66)
[06:25:53.877]  at org.apache.lucene.index.MultiTermEnum.next(MultiReader.java:315)
[06:25:53.877]  at org.apache.lucene.search.FieldCacheImpl$10.createValue(FieldCacheImpl.java:388)
[06:25:53.877]  at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
[06:25:53.877]  at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:350)
[06:25:53.877]  at org.apache.lucene.search.FieldSortedHitQueue.comparatorString(FieldSortedHitQueue.java:266)
[06:25:53.877]  at org.apache.lucene.search.FieldSortedHitQueue$1.createValue(FieldSortedHitQueue.java:182)
[06:25:53.877]  at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
[06:25:53.877]  at org.apache.lucene.search.FieldSortedHitQueue.getCachedComparator(FieldSortedHitQueue.java:155)
[06:25:53.877]  at org.apache.lucene.search.FieldSortedHitQueue.<init>(FieldSortedHitQueue.java:56)
[06:25:53.877]  at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:862)
[06:25:53.877]  at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:808)
[06:25:53.877]  at org.apache.solr.search.SolrIndexSearcher.access$000(SolrIndexSearcher.java:56)
[06:25:53.877]  at org.apache.solr.search.SolrIndexSearcher$2.regenerateItem(SolrIndexSearcher.java:254)
[06:25:53.877]  at org.apache.solr.search.LRUCache.warm(LRUCache.java:192)
[06:25:53.877]  at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1393)
[06:25:53.877]  at org.apache.solr.core.SolrCore$2.call(SolrCore.java:702)
[06:25:53.877]  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
[06:25:53.877]  at java.util.concurrent.FutureTask.run(FutureTask.java:123)
[06:25:53.877]  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
[06:25:53.877]  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
[06:25:53.877]  at java.lang.Thread.run(Thread.java:595)

Help anyone?

Attaching schema.xml and solrconfig.xml

Kindly

//Marcus Herou

Re: OOE during indexing

Posted by Marcus Herou <ma...@tailsweep.com>.
Thanks for your reply.

I set autowarmCount = 0 for both the LRUCache and the queryCache, but I still
got these errors during heavy reindexing (4M docs as fast as possible, each
doc < 10K). I removed firstSearcher and newSearcher but I still got the same
errors.

The strange thing is that now that the server has returned to a delta
indexing state, I can survive with 256M of memory where before I died with 512M.

So perhaps the conclusion is that if I reduce the number of documents indexed
per second (by introducing Thread.sleep(X)), I will survive with a lower
heap? Seems like a wrong assumption, but...

Does Lucene never use disk sorting? I mean something like this:

if (estimateRamUsed(resultset) > maxRAMPerThread)
{
  doDiskSort(resultset);  // sort on disk instead of on the heap
}
else....
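
The pseudocode above can be made concrete. What follows is a minimal sketch of the idea in Java, not a description of what Lucene does: in the stack trace earlier in this thread, sorting goes through FieldCacheImpl, which holds its data on the heap. SpillingSorter, maxInMemory and the temp-file layout are hypothetical names chosen for illustration, and the sketch assumes the values contain no line breaks.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Sorts on the heap when the input fits a budget, otherwise spills sorted runs to disk. */
final class SpillingSorter {

    /** One open run file plus the line currently at its head. */
    private static final class Run implements Comparable<Run> {
        final BufferedReader reader;
        String head;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }

        /** Moves to the next line; returns false when the run is exhausted. */
        boolean advance() throws IOException {
            head = reader.readLine();
            return head != null;
        }

        @Override
        public int compareTo(Run other) {
            return head.compareTo(other.head);
        }
    }

    /** maxInMemory is a hypothetical budget: how many values may be sorted on the heap at once. */
    static List<String> sort(List<String> values, int maxInMemory) {
        if (values.size() <= maxInMemory) {
            List<String> copy = new ArrayList<>(values);
            Collections.sort(copy);
            return copy;
        }
        try {
            return diskSort(values, Math.max(1, maxInMemory));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static List<String> diskSort(List<String> values, int chunkSize) throws IOException {
        List<Path> runFiles = new ArrayList<>();
        PriorityQueue<Run> queue = new PriorityQueue<>();
        try {
            // Phase 1: sort chunks that fit the budget and write each one to a temp file.
            for (int i = 0; i < values.size(); i += chunkSize) {
                int end = Math.min(i + chunkSize, values.size());
                List<String> chunk = new ArrayList<>(values.subList(i, end));
                Collections.sort(chunk);
                Path file = Files.createTempFile("sort-run", ".txt");
                runFiles.add(file);
                Files.write(file, chunk, StandardCharsets.UTF_8);
            }
            // Phase 2: k-way merge that reads one line per run at a time.
            for (Path file : runFiles) {
                Run run = new Run(Files.newBufferedReader(file, StandardCharsets.UTF_8));
                if (run.head != null) {
                    queue.add(run);
                } else {
                    run.reader.close();
                }
            }
            // A real implementation would stream the result instead of collecting it.
            List<String> merged = new ArrayList<>(values.size());
            while (!queue.isEmpty()) {
                Run run = queue.poll();
                merged.add(run.head);
                if (run.advance()) {
                    queue.add(run);
                } else {
                    run.reader.close();
                }
            }
            return merged;
        } finally {
            for (Run run : queue) {
                run.reader.close();
            }
            for (Path file : runFiles) {
                Files.deleteIfExists(file);
            }
        }
    }
}
```

Calling SpillingSorter.sort(values, 2) on five strings takes the disk path with three run files, while a budget of 10 sorts the same list on the heap; both return the same order. The trade-off is the usual one: the disk path bounds heap use during sorting at the price of extra I/O.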


On 1/22/08, Mike Klaas <mi...@gmail.com> wrote:
>
> Queries involving sorting can occupy a lot of memory.  During
> autowarming you need 2x peak memory usage.  The only thing you can do
> is increase your max heap size or be careful about cache autowarming
> (possibly turning it off).
>
> cheers,
> -Mike


-- 
Marcus Herou Solution Architect and Java developer Tailsweep AB
+46702561312
marcus.herou@tailsweep.com
http://www.tailsweep.com/
http://blogg.tailsweep.com/

Re: OOE during indexing

Posted by Mike Klaas <mi...@gmail.com>.
Queries involving sorting can occupy a lot of memory.  During  
autowarming you need 2x peak memory usage.  The only thing you can do  
is increase your max heap size or be careful about cache autowarming  
(possibly turning it off).

cheers,
-Mike
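
The 2x rule above can be turned into rough arithmetic. The sketch below is a back-of-the-envelope estimate, assuming that sorting on a string field keeps one int per document plus one String per unique term on the heap (which is what the getStringIndex frame in the stack trace suggests), and that during warming the old and the new searcher each hold their own copy. Only the 4 million document count comes from this thread; the number of unique terms and the 60 bytes per term are assumptions picked for illustration, and the class and method names are hypothetical.

```java
/** Back-of-the-envelope heap estimate for sorting on a string field. */
final class SortCacheEstimate {

    /**
     * Rough bytes for one searcher's string sort cache: a 4-byte ordinal per
     * document plus one String per unique term. bytesPerTerm is an assumed
     * average that includes object overhead.
     */
    static long perSearcherBytes(long maxDoc, long uniqueTerms, long bytesPerTerm) {
        return maxDoc * 4L + uniqueTerms * bytesPerTerm;
    }

    /** While a new searcher warms, the old searcher's cache is still referenced. */
    static long warmingPeakBytes(long perSearcherBytes) {
        return 2L * perSearcherBytes;
    }

    public static void main(String[] args) {
        long maxDoc = 4_000_000L;      // from the thread: ~4 million docs
        long uniqueTerms = 4_000_000L; // assumption: the sort field is unique per doc
        long bytesPerTerm = 60L;       // assumption: short keys plus String overhead

        long one = perSearcherBytes(maxDoc, uniqueTerms, bytesPerTerm);
        long peak = warmingPeakBytes(one);
        System.out.printf("per searcher: %d MB, warming peak: %d MB%n", one >> 20, peak >> 20);
    }
}
```

With these assumed inputs the estimate comes to roughly 244 MB per searcher and 488 MB while warming, which would leave almost nothing of a 512 MB heap for indexing. A different field cardinality or term length changes the figure considerably, so treat it as an order-of-magnitude check only.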



Re: OOE during indexing

Posted by Marcus Herou <ma...@tailsweep.com>.
Yep, but we rent these god damn boxes, and then, my friend, memory is a monthly
cost that is not cheap in the long term: something like $50/month for 2G more.

I might be an ultra geek when it comes to Linux and programming, but I'm not
an ultra geek at building servers from scratch. But I will straighten up and
buy myself a box, something like this:
6-8G RAM
2 quad core ~ 2.5 GHz
4 146G SCSI disks in RAID 10

to be used as a database server, and I will put everything together myself.
This will be a good exercise for replacing all our rented servers with bought
ones.

Kindly

//Marcus




On 1/23/08, Mike Klaas <mi...@gmail.com> wrote:
>
> On 22-Jan-08, at 9:46 PM, Marcus Herou wrote:
> >
> > OK I got the conclusion myself. add memory to the box and get some more
> > boxes :)
>
> I'm glad you've come to that conclusion, but to reinforce it: Solr/
> Lucene heavily benefits from loads of memory.  Not just for Solr
> caching, but it also depends on the OS being able to load the hot
> spots of the index into the disk cache.  And with memory being as
> cheap as it is...
>
> -Mike
>




Re: OOE during indexing

Posted by Mike Klaas <mi...@gmail.com>.
On 22-Jan-08, at 9:46 PM, Marcus Herou wrote:
>
> OK I got the conclusion myself. add memory to the box and get some more
> boxes :)

I'm glad you've come to that conclusion, but to reinforce it: Solr/ 
Lucene heavily benefits from loads of memory.  Not just for Solr  
caching, but it also depends on the OS being able to load the hot  
spots of the index into the disk cache.  And with memory being as  
cheap as it is...

-Mike

Re: OOE during indexing

Posted by Marcus Herou <ma...@tailsweep.com>.
Thanks!

Yes, I agree (to a certain degree) that I was being naive. Currently I'm only
using one server for this, but I will go into distributed snapshot/pull mode
soon. Then I can tune the slaves differently than the master, I believe. The
master, for instance, does not need autowarming or caches if it is not searched.

Since our company is quite new, we cannot afford to do the architecture the
"right" way (yes, I'm an errrrr system/infrastructure arch). The box I'm
running on has 2G of memory, with 4 Resin processes plus a Quartz scheduler
running. On the same box (tailsweep3) there are also a MySQL (slave) and an
ActiveMQ server, and yes, a lighttpd server. All of these guys compete
for memory, CPU (idling) and I/O. So frankly I'm out of memory on the box
(or swapping like hell).

OK, I reached the conclusion myself: add memory to the box and get some more
boxes :)

Kindly

//Marcus





On 1/22/08, Chris Hostetter <ho...@fucit.org> wrote:
>
> : I get OOE with Solr 1.3 Autowarm seem to be the villain in cojunction with
> : FieldCache somehow.
> : JVM args: -Xmx512m -Xms512m -Xss128k
> :
> : Index size is ~4 Million docs, where I index text and store database primary
>
> it seems naive to me to only allow 512MB for an index of 4 million docs --
> no matter how small the docs are.
>
> : I have removed "firstSearcher" and "newSearcher" since the queries I used
> : before killed performance on reindexing the whole index. I will add them
> : later again when I get into a delta update index state.
>
> in that case you should change the autowarm counts of all your caches to 0
> as well ... that's what is causing your stack trace below: it's
> autowarming the queryResultCache which wants to use the FieldCache (either
> for sorting or faceting)
>
> Also note that you have maxWarmingSearchers set to 4 -- that's fine for a
> master that's being updated frequently -- as long as you have the caches
> disabled (or at least the autowarming of the caches disabled)
>
> : [06:25:53.877] Error during auto-warming of key:
> : org.apache.solr.search.QueryResultKey@19695d2d:java.lang.OutOfMemoryError:
> : Java heap space
>
> : [06:25:53.877]  at org.apache.solr.search.SolrIndexSearcher.getDocListNC(
> : SolrIndexSearcher.java:862)
>
> : [06:25:53.877]  at org.apache.solr.search.LRUCache.warm(LRUCache.java:192)
> : [06:25:53.877]  at org.apache.solr.search.SolrIndexSearcher.warm(
> : SolrIndexSearcher.java:1393)
>
> -Hoss



Re: OOE during indexing

Posted by Chris Hostetter <ho...@fucit.org>.
: I get OOE with Solr 1.3 Autowarm seem to be the villain in cojunction with
: FieldCache somehow.
: JVM args: -Xmx512m -Xms512m -Xss128k
: 
: Index size is ~4 Million docs, where I index text and store database primary

it seems naive to me to only allow 512MB for an index of 4 million docs -- 
no matter how small the docs are.

: I have removed "firstSearcher" and "newSearcher" since the queries I used
: before killed performance on reindexing the whole index. I will add them
: later again when I get into a delta update index state.

in that case you should change the autowarm counts of all your caches to 0 
as well ... that's what is causing your stack trace below: it's 
autowarming the queryResultCache which wants to use the FieldCache (either 
for sorting or faceting)

Also note that you have maxWarmingSearchers set to 4 -- that's fine for a 
master that's being updated frequently -- as long as you have the caches 
disabled (or at least the autowarming of the caches disabled)


: [06:25:53.877] Error during auto-warming of key:
: org.apache.solr.search.QueryResultKey@19695d2d:java.lang.OutOfMemoryError:
: Java heap space

: [06:25:53.877]  at org.apache.solr.search.SolrIndexSearcher.getDocListNC(
: SolrIndexSearcher.java:862)

: [06:25:53.877]  at org.apache.solr.search.LRUCache.warm(LRUCache.java:192)
: [06:25:53.877]  at org.apache.solr.search.SolrIndexSearcher.warm(
: SolrIndexSearcher.java:1393)



-Hoss
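
The explanation of the stack trace above can be illustrated with a toy model of autowarming. This is a simplified sketch, not Solr's LRUCache: WarmableLruCache, its warm method and the regenerator function are hypothetical names. The assumption, based on the LRUCache.warm and regenerateItem frames in the trace, is that a new cache is filled by re-running the most recently used keys of the old cache, and that each re-run query is what ends up touching the FieldCache.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Simplified model of an LRU cache with autowarming; not Solr's implementation. */
final class WarmableLruCache<K, V> {
    private final int autowarmCount;
    private final LinkedHashMap<K, V> map;

    WarmableLruCache(final int maxSize, int autowarmCount) {
        this.autowarmCount = autowarmCount;
        // accessOrder=true keeps the most recently used entries at the end.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    void put(K key, V value) {
        map.put(key, value);
    }

    V get(K key) {
        return map.get(key);
    }

    int size() {
        return map.size();
    }

    /**
     * Fills this (new) cache by re-running the most recently used keys of the
     * old cache through the regenerator. Returns how many entries were
     * regenerated; each one stands for a query executed during warming.
     */
    int warm(WarmableLruCache<K, V> old, Function<K, V> regenerator) {
        List<K> keys = new ArrayList<>(old.map.keySet()); // least to most recently used
        int from = Math.max(0, keys.size() - autowarmCount);
        for (K key : keys.subList(from, keys.size())) {
            put(key, regenerator.apply(key));
        }
        return keys.size() - from;
    }
}
```

With autowarmCount set to 0 the regenerator is never called, so no sorted query is re-executed while a new searcher is being opened. This only removes the warming-time work; the first real sorted query against the new searcher would still have to populate the FieldCache.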