Posted to solr-user@lucene.apache.org by Paul deGrandis <pa...@gmail.com> on 2008/10/16 21:28:29 UTC

Reduction of open files

I have been working with Solr for a few months now.  According to some
documentation I read, each segment should have only one set of the
accompanying index files (norms, term frequencies, and so on).  Is
there a way to remove or reduce the files not associated with a
segment, other than optimizing the index?

I set my mergeFactor to 2 to try to tease out a solution.  I also
tried readercycle, thinking the cause was just stale readers.  That
did not work.

If anyone has any experience or knows of any documentation that can
get me closer to achieving this, I would greatly appreciate it.

Paul

Re: Reduction of open files

Posted by Grant Ingersoll <gs...@apache.org>.
That is weird.  Can you try running Lucene's CheckIndex tool on the
index?
http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/index/CheckIndex.html

It should be in the Lucene core library that ships with Solr.
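
For anyone following along, here is a rough sketch of how the tool
might be launched from a script.  It assumes CheckIndex is started
through its command-line entry point with the index directory as the
only argument; the jar name and index path are placeholders, not
values from this thread.  No repair option is passed, so the run only
reports.

```python
import subprocess


def check_index_command(lucene_jar, index_dir):
    """Build the argv for a report-only CheckIndex run.

    lucene_jar and index_dir are hypothetical paths supplied by the caller.
    """
    return [
        "java",
        "-cp", lucene_jar,
        "org.apache.lucene.index.CheckIndex",
        index_dir,
    ]


def run_check_index(lucene_jar, index_dir):
    """Run CheckIndex and return (exit code, captured standard output)."""
    result = subprocess.run(
        check_index_command(lucene_jar, index_dir),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout
```

Something like run_check_index("lucene-core.jar", "/path/to/index")
would then return the tool's report as text, provided java is on the
PATH.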


On Oct 16, 2008, at 4:27 PM, Paul deGrandis wrote:

> My biggest concern is why the remaining files stay open even though
> my mergeFactor is 2.
>
> I would expect to see one or two segment files and one or two sets of
> accompanying files (.nrm, .frq, etc.), based on the documentation.
>
> Paul

--------------------------
Grant Ingersoll
Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
http://www.lucenebootcamp.com


Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

Re: Reduction of open files

Posted by Paul deGrandis <pa...@gmail.com>.
My biggest concern is why the remaining files stay open even though
my mergeFactor is 2.

I would expect to see one or two segment files and one or two sets of
accompanying files (.nrm, .frq, etc.), based on the documentation.
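
One way to compare that expectation with what is actually on disk is
to group the index directory listing by segment.  The sketch below
assumes per-segment files are named "_<name>.<ext>" (for example
"_a.frq"), optionally with a "_<generation>" part before the extension
(for example "_a_1.del"); anything else, such as the segments_N file,
is reported under "(other)".  The function names are made up for this
example.

```python
import os
import re
from collections import defaultdict

# Assumed naming scheme: "_<name>.<ext>" or "_<name>_<generation>.<ext>".
SEGMENT_FILE = re.compile(r"^(_[0-9a-z]+)(?:_[0-9a-z]+)?\.([0-9a-z]+)$")


def group_by_segment(filenames):
    """Map each segment name to the sorted list of extensions seen for it."""
    groups = defaultdict(set)
    for name in filenames:
        match = SEGMENT_FILE.match(name)
        if match:
            groups[match.group(1)].add(match.group(2))
        else:
            # Not a per-segment file under the assumed scheme.
            groups["(other)"].add(name)
    return {segment: sorted(items) for segment, items in groups.items()}


def summarize_index_dir(path):
    """Group the files currently present in an index directory."""
    return group_by_segment(os.listdir(path))
```

With a mergeFactor of 2, the hope would be to see only one or two
segment keys in the result; many more would suggest that old segment
files are still lying around.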

Paul

On Thu, Oct 16, 2008 at 4:23 PM, Paul deGrandis
<pa...@gmail.com> wrote:
> I currently am not.
>
> The document collection is highly volatile (3000 modifications a
> minute), and from my reading I thought the compound file format would
> carry too much of a performance penalty, but I never tested it.
>
> What behavior, in terms of file creation and open file descriptors,
> is seen when useCompoundFile is set to true?
>
> Paul

Re: Reduction of open files

Posted by Paul deGrandis <pa...@gmail.com>.
I currently am not.

The document collection is highly volatile (3000 modifications a
minute), and from my reading I thought the compound file format would
carry too much of a performance penalty, but I never tested it.

What behavior, in terms of file creation and open file descriptors,
is seen when useCompoundFile is set to true?
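
A before/after comparison could be measured rather than guessed at.
The sketch below is one possible way to do that on Linux only: it reads
/proc/<pid>/fd for the servlet container's process and counts the open
files under the index directory by extension.  The pid, the index path
and the function names are assumptions for illustration.

```python
import os
from collections import Counter


def open_paths(pid):
    """Return the targets of a process's open descriptors (Linux /proc only)."""
    fd_dir = f"/proc/{pid}/fd"
    paths = []
    for fd in os.listdir(fd_dir):
        try:
            paths.append(os.readlink(os.path.join(fd_dir, fd)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return paths


def count_index_fds(paths, index_dir):
    """Count open files under index_dir, keyed by file extension.

    Linux appends " (deleted)" to the link target of a file that was
    unlinked while still open, so such files show up under keys like
    ".nrm (deleted)".
    """
    prefix = os.path.join(os.path.abspath(index_dir), "")
    counts = Counter()
    for path in paths:
        if path.startswith(prefix):
            ext = os.path.splitext(path)[1] or "(none)"
            counts[ext] += 1
    return counts
```

Calling count_index_fds(open_paths(pid), index_dir) once with
useCompoundFile off and once with it on would show how the descriptor
count changes.  Keys ending in "(deleted)" point at files that were
removed on disk but are still held open by some reader.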

Paul


On Thu, Oct 16, 2008 at 4:16 PM, Grant Ingersoll <gs...@apache.org> wrote:
> Are you using the compound file format?
>
> -Grant

Re: Reduction of open files

Posted by Grant Ingersoll <gs...@apache.org>.
Are you using the compound file format?

-Grant

On Oct 16, 2008, at 3:28 PM, Paul deGrandis wrote:

> I have been working with Solr for a few months now.  According to some
> documentation I read, each segment should have only one set of the
> accompanying index files (norms, term frequencies, and so on).  Is
> there a way to remove or reduce the files not associated with a
> segment, other than optimizing the index?
>
> I set my mergeFactor to 2 to try to tease out a solution.  I also
> tried readercycle, thinking the cause was just stale readers.  That
> did not work.
>
> If anyone has any experience or knows of any documentation that can
> get me closer to achieving this, I would greatly appreciate it.
>
> Paul
