Posted to solr-user@lucene.apache.org by samarth s <sa...@gmail.com> on 2011/10/16 20:01:31 UTC

Solr Open File Descriptors

Hi,

Is it safe to assume that with a mergeFactor of 10 the open file descriptors
required by Solr would be around (1 + 10) * 10 = 110?
ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
The Solr wiki
(http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations)
states that around 7 FDs are required per segment.

Are these estimates appropriate? Does it in any way depend on the size of the
index and the number of docs (assuming the same number of segments in any case)
as well?


-- 
Regards,
Samarth

Re: Solr Open File Descriptors

Posted by samarth s <sa...@gmail.com>.
Thanks for sharing your insights, Shawn.

On Mon, Oct 17, 2011 at 1:27 AM, Shawn Heisey <so...@elyograg.org> wrote:

> [...]


-- 
Regards,
Samarth

Re: Solr Open File Descriptors

Posted by Shawn Heisey <so...@elyograg.org>.
On 10/16/2011 12:01 PM, samarth s wrote:
> Hi,
>
> Is it safe to assume that with a mergeFactor of 10 the open file descriptors
> required by Solr would be around (1 + 10) * 10 = 110?
> ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
> The Solr wiki
> (http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations)
> states that around 7 FDs are required per segment.
>
> Are these estimates appropriate? Does it in any way depend on the size of the
> index and the number of docs (assuming the same number of segments in any
> case) as well?

My index has 10 files per normal segment (the usual 7 plus three more 
for term vectors).  Some of the segments also have a ".del" file, and 
there is a segments_* file and a segments.gen file.  Your servlet 
container and other parts of the OS will also have to open files.
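
If you want to check the per-segment file count on your own index, something like the following rough sketch could do it. It assumes per-segment files are named "_<segment>.<ext>" or "_<segment>_<gen>.del" (for example "_0.fdt" or "_0_1.del"), which matches what I see in my index directory but may differ between Lucene versions. The function names are made up for illustration.

```python
import os
import re
from collections import Counter

# Assumption: per-segment files start with "_<segment name>", followed by
# "." (normal files) or "_" (generation-stamped files such as "_0_1.del").
# Index-level files ("segments_N", "segments.gen") do not match and are skipped.
SEGMENT_FILE = re.compile(r"^(_[0-9a-z]+)[._]")


def files_per_segment(filenames):
    """Group index file names by segment and count the files in each."""
    counts = Counter()
    for name in filenames:
        match = SEGMENT_FILE.match(name)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


def files_per_segment_in_dir(index_dir):
    """Same as above, reading the names from an index directory on disk."""
    return files_per_segment(os.listdir(index_dir))
```

Point files_per_segment_in_dir() at the "index" directory under your core's data directory to see how many files each of your segments really has.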

I have personally seen three levels of segment merging taking place at 
the same time on a slow filesystem during a full-import, along with new 
content coming in at the same time.  With a mergeFactor of 10, each 
merge is 11 segments - the ten that are being merged and the merged 
segment.  If you have three going on at the same time, that's 33 
segments, and you can have up to 10 more that are actively being built 
by ongoing index activity, so that's 43 potential segments.  If your 
filesystem is REALLY slow, you might end up with even more segments as 
existing merges are paused for new ones to start, but if you run into 
that, you'll want to upgrade your hardware, so I won't consider it.

Multiplying 43 segments by 11 files per segment (the 10 above plus a 
possible ".del" file) yields a working theoretical maximum of 473 files.  
Add in the two segments files, and you're up to 475.
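
The arithmetic above can be written out as a small sketch so you can plug in your own numbers. The function and parameter names are only illustrative, and the defaults are the assumptions from this message: three merges running at once, 11 files per segment (10 plus a possible ".del" file), and two index-level files (segments_* and segments.gen).

```python
def estimate_open_files(merge_factor=10, concurrent_merges=3,
                        files_per_segment=11, index_level_files=2):
    """Rough upper bound on index files open at once for a single core."""
    # Each merge touches merge_factor source segments plus the merged result.
    merging_segments = concurrent_merges * (merge_factor + 1)
    # Up to merge_factor more segments may be under construction by indexing.
    building_segments = merge_factor
    total_segments = merging_segments + building_segments
    return total_segments * files_per_segment + index_level_files


print(estimate_open_files())  # 475 with the defaults described above
```

This only counts index files.  It does not include whatever the servlet container and the rest of the JVM have open.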

Most operating systems have a default FD limit that's at least 1024.  If 
you only have one index (core) on your Solr server, Solr is the only 
thing running on that server, and it's using the default mergeFactor of 
10, you should be fine with the default.  If you are going to have more 
than one index on your Solr server (such as a build core and a live 
core), you plan to run other things on the server, or you want to 
increase your mergeFactor significantly, you might need to adjust the OS 
configuration to allow more file descriptors.
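
To compare an estimate like that against the actual limit, here is a minimal sketch using Python's standard "resource" module (Unix only).  Note that it reports the limit for the process that runs it, so it only tells you about Solr if you run it as the same user and from the same kind of shell that starts Solr.  The "reserve" value is a made-up allowance for the servlet container and other open files, not a measured number.

```python
import resource  # Unix-only standard library module


def has_headroom(needed, soft_limit, reserve=100):
    """True if `needed` descriptors plus a reserve fit under the soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return needed + reserve <= soft_limit


if __name__ == "__main__":
    # getrlimit returns the (soft, hard) limits for the current process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    print("one core (475 files):", has_headroom(475, soft))
    print("two cores (950 files):", has_headroom(2 * 475, soft))
```

With a soft limit of 1024, one core at 475 files fits comfortably, while two cores at 950 files leave almost nothing for everything else.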

Thanks,
Shawn