Posted to solr-user@lucene.apache.org by Harald Kirsch <Ha...@raytion.com> on 2014/07/10 16:30:27 UTC

Reference numbers for major page faults per second, index size, query throughput

Hi everyone,

currently I am taking some performance measurements on a Solr 
installation and I am trying to figure out if what I see mostly fits 
expectations:

The data is as follows:

- solr 4.8.1
- 8 million documents
- mostly office documents with real text content, stored
- index size on disk 90G
- this is on a VMware server, 4 cores, 16 GB RAM
- full index memory mapped into virtual memory:

PID PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  nFLT
961 20   0 93.9g  10g 6.0g S     19 64.5 718:39.81 757k

When I start running a jmeter query test sending requests as fast as
possible with a few threads, it peaks at about 4 qps with a real-world
query replay of mostly 1, 2, sometimes more terms.

What I see are around 150 to 200 major page faults per second, meaning
that Solr is not really happy with what happens to be in memory at any
given instant.

My hunch is that this hints at a too small RAM footprint. Much more RAM 
is needed to get the number of major page faults down.
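
One way to put a number on this, as a minimal sketch assuming Linux: per
proc(5), field 12 of /proc/<pid>/stat is the cumulative major fault count
(majflt), so sampling it twice gives a rate. The function names are made
up for illustration.

```python
import time


def parse_majflt(stat_line):
    """Return the majflt counter from the content of /proc/<pid>/stat."""
    # Field 2 (the command name) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so only split what follows the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 12 (majflt) is rest[9].
    return int(rest[9])


def read_majflt(pid):
    """Read the current cumulative major fault count of a process."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_majflt(f.read())


def majflt_rate(pid, interval=1.0):
    """Major page faults per second, averaged over `interval` seconds."""
    before = read_majflt(pid)
    time.sleep(interval)
    return (read_majflt(pid) - before) / interval


# Usage (961 is the Solr PID from the top output above):
#   print(majflt_rate(961, interval=5.0))
```

At about 4 qps, 150 to 200 major faults per second works out to roughly
38 to 50 page loads from disk per query.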

Would anyone agree or disagree with this analysis? Is someone out there
saying "200 major page faults/second are normal, there must be another
problem"?

Thanks,
Harald.

Re: Reference numbers for major page faults per second, index size, query throughput

Posted by Harald Kirsch <Ha...@raytion.com>.

Hello Erik,

thanks for the reply. Indeed the CPUs are kind of idling during the load
test. They are not below 20%, but they clearly don't get far beyond 40%.

Changing the number of threads in jmeter has only a minor effect on the
qps, but it increases the average latency as soon as the threads
outnumber the CPUs, which is expected behavior I would say.

I varied the number of results returned between 10 and 20 with no
notable change in performance.

I restricted to fl=id and even this increased the throughput only
minimally, from 2.x qps to 3 (in the meantime the index has grown to
16 million documents). Jmeter reported a reduction in average
transferred size from 10 kBytes to 2.5 kBytes. This is not really the
issue here and in the end we need more than the IDs in production anyway.

What really bugs me currently is that htop reports an IORR (supposed to
be read(2) calls) of between 100 and 200 MByte/s during the load test.

This somehow runs contrary to my understanding of why Solr uses mmapped 
files. There should be no read(2) calls and certainly not 200 MB/s :-/ 
And this did not drop when I restricted to fl=id.

I will try to check this with strace to see where it is reading from.
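
A lighter-weight cross-check than strace, as a rough sketch assuming
Linux: per proc(5), /proc/<pid>/io reports rchar (bytes passed to read(2)
and similar calls, whether or not they hit the disk) separately from
read_bytes (bytes actually fetched from the storage layer). If read_bytes
grows quickly while rchar stays nearly flat, the disk traffic is probably
coming from page faults on the mmapped files rather than from read(2).
The function names are made up for illustration, and reading the file
needs the same user as the Solr process or root.

```python
import time


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def read_rates(before, after, seconds):
    """Return (rchar, read_bytes) rates in bytes per second."""
    return (
        (after["rchar"] - before["rchar"]) / seconds,
        (after["read_bytes"] - before["read_bytes"]) / seconds,
    )


def sample_read_rates(pid, seconds=5.0):
    """Sample /proc/<pid>/io twice and return the two read rates."""
    path = f"/proc/{pid}/io"
    with open(path) as f:
        before = parse_proc_io(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_proc_io(f.read())
    return read_rates(before, after, seconds)


# Usage (961 is the Solr PID from the first mail):
#   rchar_rate, disk_rate = sample_read_rates(961)
```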

Hints appreciated. With a bit of luck, I'll get more RAM and can compare 
then.

Thanks,
Harald.

On 12.07.2014 17:58, Erick Erickson wrote:
> If the stats you're reporting are during the load test, your CPU is
> kind of idling along at < 20% which supports your theory.
>
> Just to cover all bases, when you bump the number of threads jmeter is
> firing does it make any difference? And how many rows are you
> returning? This latter is important because to return documents, Solr
> needs to go out to disk, possibly generating your page faults
> (guessing here).
>
> One note about your index size.... it's largely useless to measure
> index on disk if for no other reason than the _stored_ data doesn't
> really count towards memory requirements for search. The *.fdt and
> *.fdx segment files contain the stored data, so subtract them out....
>
> Speaking of which, try just returning the id (&fl=id). That should
> reduce the disk seeks due to assembling the docs.
>
> But 4 qps for simple term queries seems very slow at first blush.
>
> FWIW,
> Erick
>

Re: Reference numbers for major page faults per second, index size, query throughput

Posted by Erick Erickson <er...@gmail.com>.

If the stats you're reporting are during the load test, your CPU is
kind of idling along at < 20% which supports your theory.

Just to cover all bases, when you bump the number of threads jmeter is
firing does it make any difference? And how many rows are you
returning? This latter is important because to return documents, Solr
needs to go out to disk, possibly generating your page faults
(guessing here).

One note about your index size.... it's largely useless to measure
index on disk if for no other reason than the _stored_ data doesn't
really count towards memory requirements for search. The *.fdt and
*.fdx segment files contain the stored data, so subtract them out....
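
The subtraction can be sketched like this, assuming the index directory
holds plain per-segment files (with compound .cfs segments the stored
fields are bundled inside and this would undercount them). The function
name and the path in the usage comment are hypothetical.

```python
import os

# Extensions of the stored-field files named above.
STORED_FIELD_EXTENSIONS = {".fdt", ".fdx"}


def index_sizes(index_dir):
    """Return (total_bytes, stored_field_bytes) for one index directory."""
    total = 0
    stored = 0
    for entry in os.scandir(index_dir):
        if not entry.is_file():
            continue
        size = entry.stat().st_size
        total += size
        if os.path.splitext(entry.name)[1] in STORED_FIELD_EXTENSIONS:
            stored += size
    return total, stored


# Usage (hypothetical path, adjust to the actual core):
#   total, stored = index_sizes("/path/to/solr/data/index")
#   print(f"searchable part: {(total - stored) / 2**30:.1f} GiB")
```

The difference between the two numbers is the part worth comparing
against the RAM left over for the OS page cache.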

Speaking of which, try just returning the id (&fl=id). That should
reduce the disk seeks due to assembling the docs.

But 4 qps for simple term queries seems very slow at first blush.

FWIW,
Erick
