Posted to solr-user@lucene.apache.org by tictacs <he...@tactics.co.uk> on 2012/11/03 23:41:19 UTC
Solr - Disk writes and set up suggestions
Hi,
My site has 30,000 widgets and 500,000 widget users.
I have created two Solr indexes, one for widgets and one for users. The
widgets index is 324MB and the users index is 9.3GB.
We are optimizing the index every hour, and during this time the server
slows to a crawl, apparently because of the amount of disk writes. atop is
showing:
  PID   SYSCPU   USRCPU   VGROW   RGROW    RDDSK     WRDSK  ST  EXC  S  CPU  CMD  1/2
15979  118m35s   17h28m    3.3G    1.2G   353.2G   1887.0G  N-    -  S   0%  java
 8674  441m37s  637m34s   15.9G    3.2G   122.6G  10805.6G  N-    -  S   0%  java
  484  189m50s    0.00s      0K      0K   94540K     91.3G  N-    -  S   0%  kjournald
  868  150m19s    0.00s      0K      0K       4K        4K  N-    -  S   0%  flush-104:0
  116  118m52s    0.00s      0K      0K       0K    383.9M  N-    -  S   0%  kswapd0
19079   21m38s   80m49s   33.5G    3.1G   113.2G    110.6G  N-    -  S   0%  mysqld
18955   33m22s   94.26s  77296K   9744K    66.7G     38.9G  N-    -  S   0%  perl
It is a well-specified machine in terms of processor and memory (24GB RAM and a
6-core Xeon), but I am wondering if I have made a mistake with the disks:
it only has standard 7200 RPM SATA disks.
Would I be much better off going for 15K RPM SAS drives? If I could
get SSDs, would they be an improvement? My current server host's charges
for SSD drives are obscene, though, so that isn't likely to happen...
Currently the index that my application searches against sits on the server
where optimization takes place, and search slows noticeably. I could easily
run a slave on another, lower-powered machine, but my host only has a 100 Mbps
connection between servers, and I am concerned that, given the size of the
index, copying it between machines will still cause heavy disk writes on the
slave machine and I will be no better off.
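As a rough back-of-the-envelope check on the replication concern, the sketch below estimates how long one full copy of the users index would take over the inter-server link. It assumes decimal units (1 GB = 10^9 bytes), a link that sustains its full nominal 100 Mbps, and no protocol overhead, so real transfers would be slower. The function name is illustrative only.

```python
def transfer_seconds(index_gb: float, link_mbps: float) -> float:
    """Best-case time to copy index_gb gigabytes over a link_mbps link."""
    bits = index_gb * 1e9 * 8          # gigabytes -> bits (decimal units)
    return bits / (link_mbps * 1e6)    # bits / (bits per second)


users_index_gb = 9.3   # users index size, from the post
link_mbps = 100        # inter-server link speed, from the post

seconds = transfer_seconds(users_index_gb, link_mbps)
print(f"{seconds:.0f} s (~{seconds / 60:.1f} min) per full copy")
```

Under these assumptions one full copy takes about 744 seconds, a little over 12 minutes. If every optimize forces the slave to pull the whole index again, an hourly schedule would keep the link busy for roughly a fifth of each hour even in this best case.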
Does anyone have any suggestions for a server setup that keeps search
consistently fast for end users?
Cheers,
Tictacs
--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Disk-writes-and-set-up-suggestions-tp4018031.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - Disk writes and set up suggestions
Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,
This should become a FAQ. Short version: don't optimize. Check the ML archives
for recent messages and explanations.
If you have a monitoring tool, look at disk I/O during and after
optimization, check Solr cache hit rates, etc.
Otis
--
Performance Monitoring - http://sematext.com/spm
Re: Solr - Disk writes and set up suggestions
Posted by Walter Underwood <wu...@wunderwood.org>.
Or, don't "optimize" (force merge) at all. Really. This is a manual override for an automatic process, merging.
I can only think of one case where a forced merge makes sense:
1. All documents are reindexed.
2. Traditional Solr replication is used (not SolrCloud).
3. Replication is manually timed to be after the forced merge.
For incremental indexing with replication, a forced merge will cause a huge amount of disk and network traffic.
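To make the distinction concrete, here is a minimal sketch of the two update requests involved. It relies on the `commit` and `optimize` request parameters of Solr's update handler. The host, port, and core name (`localhost:8983`, `users`) are placeholders, not details from this thread, and the sketch only builds the URLs without contacting a server.

```python
from urllib.parse import urlencode


def update_url(base: str, core: str, *, optimize: bool = False) -> str:
    """Build a Solr update URL that commits, or that forces a merge instead."""
    params = {"optimize": "true"} if optimize else {"commit": "true"}
    return f"{base}/solr/{core}/update?{urlencode(params)}"


# Hypothetical host and core name, for illustration only.
base = "http://localhost:8983"

# Routine case: make new documents searchable; merging stays automatic.
print(update_url(base, "users"))

# Forced merge: rewrites the index, so a replica has to fetch it all again.
print(update_url(base, "users", optimize=True))
```

If an hourly job currently calls the second form, switching it to the first keeps new documents visible while leaving segment merging to the automatic process.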
wunder
Former search guy for Netflix
Current search guy for Chegg
On Nov 4, 2012, at 12:02 PM, tictacs wrote:
> Thanks for the reply both and apologies if this is a recurring question.
> From the sounds of it I am sure an optimize overnight when app traffic is
> low will suffice. This will massively help with server performance I am
> sure.
Re: Solr - Disk writes and set up suggestions
Posted by Walter Underwood <wu...@wunderwood.org>.
Yes. I can guarantee that a force merge will not "massively help". It might not even measurably help.
wunder
On Nov 4, 2012, at 1:05 PM, Otis Gospodnetic wrote:
> Measure / monitor first :)
> You may not need to optimize at all, especially if your index is always
> being modified.
>
> Otis
--
Walter Underwood
wunder@wunderwood.org
RE: Solr - Disk writes and set up suggestions
Posted by Otis Gospodnetic <ot...@gmail.com>.
Measure / monitor first :)
You may not need to optimize at all, especially if your index is always
being modified.
Otis
--
Performance Monitoring - http://sematext.com/spm
On Nov 4, 2012 3:03 PM, "tictacs" <he...@tactics.co.uk> wrote:
> Thanks for the reply both and apologies if this is a recurring question.
> From the sounds of it I am sure an optimize overnight when app traffic is
> low will suffice. This will massively help with server performance I am
> sure.
RE: Solr - Disk writes and set up suggestions
Posted by tictacs <he...@tactics.co.uk>.
Thanks for the reply both and apologies if this is a recurring question.
From the sounds of it I am sure an optimize overnight when app traffic is
low will suffice. This will massively help with server performance I am
sure.
RE: Solr - Disk writes and set up suggestions
Posted by Michael Ryan <mr...@moreover.com>.
I'd recommend not optimizing every hour. Are you seeing a significant performance increase from optimizing this frequently?
-Michael
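One way to answer that question is to time a fixed set of representative queries before and after an optimize and compare the medians. The sketch below is a minimal version under stated assumptions: `run_query` is a hypothetical callable standing in for whatever issues a search against your Solr instance, and the sample queries are invented.

```python
import statistics
import time
from typing import Callable, Iterable


def median_latency_ms(run_query: Callable[[str], object],
                      queries: Iterable[str]) -> float:
    """Run each query once and return the median wall-clock latency in ms."""
    timings = []
    for q in queries:
        start = time.perf_counter()
        run_query(q)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


if __name__ == "__main__":
    # Stand-in for a real search call; replace with your own client code.
    def fake_query(q: str) -> None:
        time.sleep(0.001)

    sample = ["widget", "user:alice", "blue widget"]  # invented queries
    print(f"median: {median_latency_ms(fake_query, sample):.2f} ms")
```

Running the same query set shortly before and shortly after an optimize, under similar load, gives two medians to compare. If they are close, the hourly optimize is costing disk writes without buying search speed.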