Posted to solr-user@lucene.apache.org by samarth s <sa...@gmail.com> on 2011/10/16 20:01:31 UTC
Solr Open File Descriptors
Hi,
Is it safe to assume that with a mergeFactor of 10 the open file descriptors
required by Solr would be around (1 + 10) * 10 = 110?
ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
The Solr wiki:
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
states that the number of FDs required per segment is around 7.
Are these estimates appropriate? Does it in any way depend on the size of the
index and the number of docs (assuming the same number of segments in any
case) as well?
--
Regards,
Samarth
Re: Solr Open File Descriptors
Posted by samarth s <sa...@gmail.com>.
Thanks for sharing your insights, Shawn.
On Mon, Oct 17, 2011 at 1:27 AM, Shawn Heisey <so...@elyograg.org> wrote:
> On 10/16/2011 12:01 PM, samarth s wrote:
>
>> Hi,
>>
>> Is it safe to assume that with a mergeFactor of 10 the open file
>> descriptors required by Solr would be around (1 + 10) * 10 = 110?
>> ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
>> The Solr wiki:
>> http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
>> states that the number of FDs required per segment is around 7.
>>
>> Are these estimates appropriate? Does it in any way depend on the size of
>> the index and the number of docs (assuming the same number of segments in
>> any case) as well?
>>
>
> My index has 10 files per normal segment (the usual 7 plus three more for
> termvectors). Some of the segments also have a ".del" file, and there is a
> segments_* file and a segments.gen file. Your servlet container and other
> parts of the OS will also have to open files.
>
> I have personally seen three levels of segment merging taking place at the
> same time on a slow filesystem during a full-import, along with new content
> coming in at the same time. With a mergefactor of 10, each merge is 11
> segments - the ten that are being merged and the merged segment. If you
> have three going on at the same time, that's 33 segments, and you can have
> up to 10 more that are actively being built by ongoing index activity, so
> that's 43 potential segments. If your filesystem is REALLY slow, you might
> end up with even more segments as existing merges are paused for new ones to
> start, but if you run into that, you'll want to update your hardware, so I
> won't consider it.
>
> Multiplying 43 segments by 11 files per segment (the 10 above plus a
> possible ".del" file) yields a working theoretical maximum of 473 files.
> Add in the two segments files and you're up to 475.
>
> Most operating systems have a default FD limit that's at least 1024. If
> you only have one index (core) on your Solr server, Solr is the only thing
> running on that server, and it's using the default mergeFactor of 10, you
> should be fine with the default. If you are going to have more than one
> index on your Solr server (such as a build core and a live core), you plan
> to run other things on the server, or you want to increase your mergeFactor
> significantly, you might need to adjust the OS configuration to allow more
> file descriptors.
>
> Thanks,
> Shawn
>
>
--
Regards,
Samarth
Re: Solr Open File Descriptors
Posted by Shawn Heisey <so...@elyograg.org>.
On 10/16/2011 12:01 PM, samarth s wrote:
> Hi,
>
> Is it safe to assume that with a mergeFactor of 10 the open file descriptors
> required by Solr would be around (1 + 10) * 10 = 110?
> ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
> The Solr wiki:
> http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
> states that the number of FDs required per segment is around 7.
>
> Are these estimates appropriate? Does it in any way depend on the size of the
> index and the number of docs (assuming the same number of segments in any
> case) as well?
My index has 10 files per normal segment (the usual 7 plus three more
for termvectors). Some of the segments also have a ".del" file, and
there is a segments_* file and a segments.gen file. Your servlet
container and other parts of the OS will also have to open files.
I have personally seen three levels of segment merging taking place at
the same time on a slow filesystem during a full-import, along with new
content coming in at the same time. With a mergeFactor of 10, each
merge is 11 segments - the ten that are being merged and the merged
segment. If you have three going on at the same time, that's 33
segments, and you can have up to 10 more that are actively being built
by ongoing index activity, so that's 43 potential segments. If your
filesystem is REALLY slow, you might end up with even more segments as
existing merges are paused for new ones to start, but if you run into
that, you'll want to update your hardware, so I won't consider it.
Multiplying 43 segments by 11 files per segment (the 10 above plus a
possible ".del" file) yields a working theoretical maximum of 473 files.
Add in the two segments files and you're up to 475.
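
The arithmetic above can be written out as a small calculation. This is only a rough sketch of the estimate described in this message, not anything Solr or Lucene provides: the function name and its parameters are made up for illustration, and the defaults are simply the numbers used above (three simultaneous merges, 11 files per segment, two index-level segments files).

```python
def estimate_open_files(merge_factor=10,
                        concurrent_merges=3,
                        files_per_segment=11,
                        index_level_files=2):
    """Rough upper estimate of index files open at once (hypothetical helper)."""
    # Each merge touches merge_factor source segments plus the merged result.
    segments_in_merges = concurrent_merges * (merge_factor + 1)
    # Up to merge_factor further segments may be under construction
    # from ongoing indexing.
    segments_being_built = merge_factor
    total_segments = segments_in_merges + segments_being_built
    # index_level_files covers the segments_* file and segments.gen.
    return total_segments * files_per_segment + index_level_files


print(estimate_open_files())  # 43 segments * 11 files + 2 = 475
```

Changing merge_factor or files_per_segment shows how quickly the estimate grows. It still leaves out whatever the servlet container and the rest of the OS open.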
Most operating systems have a default FD limit that's at least 1024. If
you only have one index (core) on your Solr server, Solr is the only
thing running on that server, and it's using the default mergeFactor of
10, you should be fine with the default. If you are going to have more
than one index on your Solr server (such as a build core and a live
core), you plan to run other things on the server, or you want to
increase your mergeFactor significantly, you might need to adjust the OS
configuration to allow more file descriptors.
Thanks,
Shawn
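
To compare an estimate like the 475 files above against the limit actually in effect, one option on Unix-like systems is Python's standard resource module. The sketch below is an illustration only: fd_headroom is a hypothetical helper, and it reports the soft limit of the Python process that runs it. The Solr JVM may run under a different limit, so it would need to be run as the same user and from the same environment that starts Solr.

```python
import resource


def fd_headroom(estimated_files):
    """Return soft FD limit minus the estimate, or None if the limit is unlimited."""
    # RLIMIT_NOFILE is the per-process cap on open file descriptors.
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft - estimated_files


headroom = fd_headroom(475)
if headroom is None:
    print("No soft limit on open files")
elif headroom < 0:
    print("Estimate exceeds the soft limit by", -headroom)
else:
    print("Descriptors left over after the estimate:", headroom)
```

A negative result suggests raising the limit in the OS configuration, as described in the message above.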