Posted to solr-user@lucene.apache.org by Frank Wennerdahl <fr...@arcadelia.com> on 2013/03/25 18:18:48 UTC

Scaling Solr on VMWare

Hi.

 

We are currently benchmarking our Solr setup and are having trouble with
scaling hardware for a single Solr instance. We want to investigate how one
instance scales with hardware to find the optimal ratio of hardware vs
sharding when scaling. Our main problem is that we cannot identify any
hardware limitation: CPU is far from maxed out, disk I/O is not an issue as
far as we can see, and there is plenty of RAM available.

 

In short we have a couple of questions that we hope someone here could help
us with. Detailed information about our setup, use case and things we've
tried is provided below the questions.

 

Questions:

1.       What could cause Solr to utilize only 2 CPU cores when sending
multiple update requests in parallel in a VMWare environment?

2.       Is there a software limit on the number of CPU cores that Solr can
utilize while indexing?

3.       Ruling out network and disk performance, what could cause a
decrease in indexing speed when sending data over a network as opposed to
sending it from the local machine?

 

We are running three Solr cores per Solr instance, but only one core
receives any non-trivial load. We are using VMWare (ESX 5.0) virtual
machines for hosting Solr and a QNAP NAS containing 12 HDDs in a RAID5 setup
for storage. Our data consists of a huge number of small documents.
When indexing we are using Solr's javabin format (although not through
Solrj, we have implemented the format in C#/.NET) and our batch size is
currently 1000 documents. The actual size of the data varies, but the
batches we have used range from approximately 450KB to 1050KB. We're sending
these batches to Solr in parallel using a number of send threads.
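
For readers who want to reproduce this kind of load, the sending side described above can be sketched roughly as follows. This is not the C#/.NET client from the post; it is a minimal Python illustration. The update URL (host, port, core name) and the helper names post_batch and send_batches are assumptions, and the batches are taken to be pre-encoded javabin payloads read from files. Unlike a tuned client, urllib opens a new connection per request.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed endpoint: host, port and core name are placeholders, not from the post.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update"


def post_batch(payload, url=SOLR_UPDATE_URL):
    """POST one pre-encoded batch and return the number of bytes sent."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/javabin"},
    )
    with urllib.request.urlopen(request) as response:
        response.read()  # read the (small) response body before moving on
    return len(payload)


def send_batches(batches, send=post_batch, threads=4):
    """Send all batches with `threads` workers.

    Returns (total_bytes, elapsed_seconds) so the caller can derive MB/s.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        total_bytes = sum(pool.map(send, batches))
    elapsed = time.perf_counter() - start
    return total_bytes, elapsed
```

Throughput in MB per second is then total_bytes / elapsed / 1_000_000. Passing a different `send` callable makes it possible to time the harness itself without a Solr instance.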

 

There are two issues that we've run into:

1.       When sending data from one VM to Solr on another VM we observed
that Solr did not seem to utilize CPU cores properly. The Solr VM had 8
vCPUs available and we were using 4 threads sending data in parallel. We saw
low (~29%) CPU utilization on the Solr VM, with 2 cores doing almost all
the work while the remaining cores stayed almost idle. Increasing the
number of send threads to 8 yielded the same result, capping our indexing
speed at about 4.88MB per second. The client VM had 4 vCPUs which were
hardly utilized as we were reading data from pre-generated files.

To rule out network limitations we sent the test data to a server on the
Solr VM that simply accepted the request and returned an empty response. We
were able to send data at 219MB per second, so the network did not seem to
be the bottleneck. We also tested sending data to Solr locally from the Solr
VM to see if disk I/O was the problem. Surprisingly we were able to index
significantly faster, at 7.34MB per second using 4 send threads (8.4MB per
second with 6 send threads), which indicated that the disk was not slowing
us down when sending data over the network. Worth noting is that the CPU
utilization was now higher (47.81% with 4 threads, 58.8% with 6) and the
work was spread out over all cores. As before we used pre-generated files
and the process sending the data used almost no CPU.

2.       We decided to investigate how Solr would scale with additional
vCPUs when indexing locally. We increased the number of vCPUs to 16 and the
number of send threads to 8. Sadly we now experienced a decrease in
performance: 7MB/s with 8 threads, 6.4MB/s with 12 threads and 4.95MB/s with
16 threads. The CPU usage was on average 30%, regardless of the number of
threads used. We know that additional vCPUs can cause decreased performance
in VMWare virtual machines due to time spent waiting for CPUs to become
available. We investigated this using esxtop, which only showed a 1% CSTP.
According to VMWare
<http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1005362>
a CSTP above 3% could indicate that multiple vCPUs are causing performance
issues.

We noticed that the average disk write speed seemed to cap at around 11.5
million bytes per second, so we tested the same VM setup using a faster disk.
This did not yield any increase in performance (it was actually somewhat
slower), nor did using a RAM-mapped drive for Solr.
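
The figures reported in the two issues above can be put side by side with a little arithmetic. The sketch below converts average CPU utilization into "fully busy core" equivalents and computes throughput per send thread. It only restates the reported averages and is not a diagnosis. The function names are illustrative, the 4.95 figure is assumed to be in MB/s, and the remote run is assumed to pair the ~29% utilization with the 8-thread, 4.88MB/s result.

```python
def busy_cores(utilization_percent, vcpus):
    """Average VM-wide CPU utilization expressed as fully busy cores."""
    return utilization_percent / 100.0 * vcpus


def per_thread(mb_per_s, threads):
    """Indexing throughput attributable to each send thread."""
    return mb_per_s / threads


# (label, vCPUs, send threads, MB/s, average CPU %), as reported in the post
runs = [
    ("remote, 8 vCPUs", 8, 8, 4.88, 29.0),
    ("local, 8 vCPUs", 8, 4, 7.34, 47.81),
    ("local, 8 vCPUs", 8, 6, 8.4, 58.8),
    ("local, 16 vCPUs", 16, 8, 7.0, 30.0),
    ("local, 16 vCPUs", 16, 12, 6.4, 30.0),
    ("local, 16 vCPUs", 16, 16, 4.95, 30.0),
]

for label, vcpus, threads, mb_per_s, cpu in runs:
    print(
        f"{label:16} threads={threads:2d} "
        f"busy cores={busy_cores(cpu, vcpus):.2f} "
        f"MB/s per thread={per_thread(mb_per_s, threads):.2f}"
    )
```

Read this way, 29% of 8 vCPUs is about 2.3 busy cores, which matches the observation of 2 cores doing almost all the work. Likewise, 30% of 16 vCPUs is about 4.8 busy cores, close to the 4.7 implied by 58.8% of 8 vCPUs.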

 

Any help or ideas of what could be the bottleneck in our setup would be
greatly appreciated!

 

Best regards,

Frank Wennerdahl

Developer

Arcadelia AB


Re: Scaling Solr on VMWare

Posted by Peter Sturge <pe...@gmail.com>.
Hi,

We have run Solr in VM environments extensively (3.6, not SolrCloud, but the
issues will be similar).
There are some significant things to be aware of when running Solr in a
virtualized environment (these can be equally true with Hyper-V and Xen as
well):
- If you're doing heavy indexing, the networking can be a real bottleneck,
  depending on the environment.
- If you're using a virtual cluster, and you have other VMs that use lots of
  network and/or CPU (e.g. a SQL Server, email etc.), you will encounter
  performance issues (note: it's generally a good idea to tie a Solr instance
  to a physical machine in the cluster).
- Using virtual switches can, in some instances, create network bottlenecks,
  particularly with high-volume indexing. There are myriad scenarios for
  vSwitches, so it's not practical to go into all of them here, but the
  general rule is: be careful!
- CPU context switching can have a huge impact on Solr, so assigning CPUs,
  cores and virtual cores needs some care to ensure there's enough CPU
  resource to get the jobs done, but not so many that the VM is continually
  waiting for cores to become free (VMWare will wait until all configured
  core slots are free before proceeding with a request).

The above only scratches the surface of running multi-threaded production
applications like Solr in a virtual environment, but hopefully it can
provide a starting point.
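
One way to check whether the network path is the limiting factor, in the spirit of the test described in the original post, is a sink server that accepts each request, discards the body and returns an empty response. The following is a minimal Python sketch of such a sink, not the server used in the thread. The names SinkHandler and start_sink are hypothetical, and port 0 asks the OS for any free port.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SinkHandler(BaseHTTPRequestHandler):
    """Accept any POST, discard the body, and answer with an empty 200."""

    protocol_version = "HTTP/1.1"  # lets clients reuse connections if they want

    def do_POST(self):
        remaining = int(self.headers.get("Content-Length", 0))
        while remaining > 0:
            # Read in chunks so large batches are not held in memory at once.
            chunk = self.rfile.read(min(remaining, 64 * 1024))
            if not chunk:
                break
            remaining -= len(chunk)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # no per-request logging, so logging does not skew timings


def start_sink(host="127.0.0.1", port=0):
    """Start the sink in a background thread and return the server object."""
    server = ThreadingHTTPServer((host, port), SinkHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Pointing the same send threads at this sink instead of Solr gives an upper bound for what the client and network can deliver. A Python sink will probably saturate earlier than a native one, so a low figure here is less conclusive than a high one.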

Thanks,
Peter



On Wed, Apr 17, 2013 at 11:56 AM, adfel70 <ad...@gmail.com> wrote:

> Hi
> We are currently considering running SolrCloud on VMWare.
> Do you have any insights regarding the issue you encountered and, more
> generally, regarding using virtual machines instead of physical machines
> for SolrCloud?

RE: Scaling Solr on VMWare

Posted by adfel70 <ad...@gmail.com>.
Hi
We are currently considering running SolrCloud on VMWare.
Do you have any insights regarding the issue you encountered and, more
generally, regarding using virtual machines instead of physical machines
for SolrCloud?








RE: Scaling Solr on VMWare

Posted by Frank Wennerdahl <fr...@arcadelia.com>.
Hi Otis and thanks for your response.

We do suspect that the problem with only 2 cores being used might be caused
by the virtual environment. We're hoping that someone with experience of
running Solr on VMWare might know more about this or the other issues we
have.

The servlet container we're running is the bundled Jetty (Solr version 4.1).
As we have seen a higher number of CPU cores utilized when sending data to
Solr locally, it seems that the servlet container isn't restricting the
number of threads used.

Frank



Re: Scaling Solr on VMWare

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi Frank,

If your servlet container had a crazy low setting for the max number
of threads I think you would see the CPU underutilized.  But I think
you would also see errors on the client about connections being
requested.  Sounds like possibly a VM issue that's not
Solr-specific...

Otis
--
Solr & ElasticSearch Support
http://sematext.com/




