Posted to java-user@lucene.apache.org by Terry Steichen <te...@net-frame.com> on 2002/11/07 21:25:52 UTC

Mushrooming Index Files

I have a small set of documents (150) that I'm running some tests on.  After I index them, I have 46 files in the index directory. Then I select a subset, and for each document I (a) remove it from the index, (b) edit it, and (c) add it back into the index.  The code works fine.

However, the number of files in the index directory is mushrooming.  Here's what happens:

Edit/add/delete 5 documents - index files grow from 46 to 91
Edit/add/delete 5 documents - index files grow from 91 to 133
Edit/add/delete 5 documents - index files grow from 133 to 178

Each time, the number of index files appears to grow by about 45.  I then optimize, and the optimize returns immediately (which, in my experience, suggests that optimization isn't needed, or perhaps isn't working?).

I eventually got to over 500 index files and ended up running out of file handles. 

Is this normal?

Regards,

Terry
    
PS: I'm using 1.2 RC5

PPS: When I reindex, it goes back to about 46 files.



Re: Mushrooming Index Files

Posted by Doug Cutting <cu...@lucene.com>.
My guess is that you have around 40 fields.  Each field requires a 
separate file in each segment.  Can you combine any of your fields?
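
To make the arithmetic behind this concrete, here is a minimal sketch of how the file count could scale with the number of segments and fields. The function name and the values for fixed_per_segment and index_level_files are assumptions chosen only so that one segment with about 40 fields gives the 46 files reported; they are not taken from the Lucene source.

```python
def estimate_index_files(segments, fields, fixed_per_segment=5, index_level_files=1):
    """Rough file count for an index that keeps one file per field per segment.

    fixed_per_segment and index_level_files are illustrative assumptions,
    picked so that a single segment with 40 fields yields 46 files.
    """
    per_segment = fields + fixed_per_segment
    return index_level_files + segments * per_segment


if __name__ == "__main__":
    # Each unmerged segment adds roughly (fields + fixed_per_segment) files.
    for segments in (1, 2, 4, 12):
        print(segments, estimate_index_files(segments, fields=40))
```

Under these assumptions, twelve unmerged segments with 40 fields already exceed 500 files, while the same twelve segments with 10 fields stay under 200, which is why combining fields reduces the pressure on file handles.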

Terry Steichen wrote:
> I need to correct my original report.  I was in error: optimization does
> indeed bring the total number of index files back to 46.  However, further
> experiments show that the file count grows even larger than I originally
> indicated if I do even a modest amount of removing and adding documents
> between optimizations.
> 


Re: Mushrooming Index Files

Posted by Terry Steichen <te...@net-frame.com>.
I need to correct my original report.  I was in error: optimization does
indeed bring the total number of index files back to 46.  However, further
experiments show that the file count grows even larger than I originally
indicated if I do even a modest amount of removing and adding documents
between optimizations.
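
One way to keep an eye on this growth between optimizations is to count the files in the index directory and treat a threshold as a cue to optimize. The sketch below is only an illustration: the helper names and the max_files default are hypothetical, and it inspects the directory from outside rather than using any Lucene API.

```python
import os


def count_index_files(index_dir):
    """Count regular files directly inside index_dir (non-recursive)."""
    return sum(
        1
        for name in os.listdir(index_dir)
        if os.path.isfile(os.path.join(index_dir, name))
    )


def should_optimize(index_dir, max_files=200):
    """Return True once the file count exceeds max_files.

    max_files is an arbitrary illustrative threshold; pick one well below
    the process's file-handle limit.
    """
    return count_index_files(index_dir) > max_files
```

Calling should_optimize after each batch of remove/add operations would give an early warning well before the file-handle limit is reached.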
