Posted to solr-user@lucene.apache.org by oleg_gnatovskiy <ol...@citysearch.com> on 2009/01/21 00:19:46 UTC

Re: Query Performance while updating the index

Hello again. It seems that we are still having these problems. Queries take
as long as 20 minutes to get back to their average response time after a
large index update, so it doesn't seem like the problem is the 12 second
autowarm time. Are there any more suggestions for things we can try? Taking
our servers out of the loop for as long as 20 minutes is a bit of a hassle,
and a risk.
-- 
View this message in context: http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p21573927.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Query Performance while updating the index

Posted by Chris Hostetter <ho...@fucit.org>.
: Just to clarify - we do not optimize on the slaves at all. We only optimize
: on the master.

that doesn't change anything about the comments that i made before.  it
*really* wouldn't make sense to optimize on a slave right before pulling a
new snapshot, but it still doesn't make any more sense to optimize on a
master right before doing some updates and then pulling a new snapshot.
my second comment also still applies: a snappull after an optimize is
always going to involve more churn on the disk...

: > : We do optimize the index before updates but we get these performance
: > : issues even when we pull an empty snapshot. Thus even when our update
: > : is tiny, the performance issues still happen.
: > 
: > FWIW: this behavior doesn't make a lot of sense -- optimizing just
: > before you are about to make updates/additions to your data is a complete
: > waste.  the main value in optimizing your index is that you have one
: > segment; as soon as you add a document, that changes.
: > 
: > the other thing to keep in mind is that an optimized index is a completely
: > new segment in a new file with a new name, so there is going to be added
: > overhead on the slave machines as the OS purges the old index files and
: > replaces them with the new optimized index files -- more overhead than if
: > you had just done your additions w/o optimizing first.



-Hoss


Re: Query Performance while updating the index

Posted by oleg_gnatovskiy <ol...@citysearch.com>.
Just to clarify - we do not optimize on the slaves at all. We only optimize
on the master.

hossman wrote:
> 
> 
> : We do optimize the index before updates but we get these performance
> : issues even when we pull an empty snapshot. Thus even when our update
> : is tiny, the performance issues still happen.
> 
> FWIW: this behavior doesn't make a lot of sense -- optimizing just
> before you are about to make updates/additions to your data is a complete
> waste.  the main value in optimizing your index is that you have one
> segment; as soon as you add a document, that changes.
> 
> the other thing to keep in mind is that an optimized index is a completely
> new segment in a new file with a new name, so there is going to be added
> overhead on the slave machines as the OS purges the old index files and
> replaces them with the new optimized index files -- more overhead than if
> you had just done your additions w/o optimizing first.
> 
> 
> 
> -Hoss
> 
> 
> 



Re: Query Performance while updating the index

Posted by Chris Hostetter <ho...@fucit.org>.
: We do optimize the index before updates but we get these performance issues
: even when we pull an empty snapshot. Thus even when our update is tiny, the
: performance issues still happen.

FWIW: this behavior doesn't make a lot of sense -- optimizing just
before you are about to make updates/additions to your data is a complete
waste.  the main value in optimizing your index is that you have one
segment; as soon as you add a document, that changes.

the other thing to keep in mind is that an optimized index is a completely
new segment in a new file with a new name, so there is going to be added
overhead on the slave machines as the OS purges the old index files and
replaces them with the new optimized index files -- more overhead than if
you had just done your additions w/o optimizing first.
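
To put a rough number on that extra churn, here is a minimal sketch. It is not anything Solr ships, and it is not part of the snapshot scripts. It assumes index files are write-once, so a file name present in both snapshots can be treated as unchanged; the function name, the name-to-size dictionaries, and the file names and sizes are all hypothetical.

```python
def snapshot_churn(old_files, new_files):
    """Estimate disk churn on a slave when it swaps one snapshot for another.

    old_files / new_files map file name -> size in bytes (hypothetical inputs,
    e.g. built by listing two snapshot directories).
    Assumption: index files are write-once, so a shared name means "unchanged".
    """
    added = {n: s for n, s in new_files.items() if n not in old_files}
    removed = {n: s for n, s in old_files.items() if n not in new_files}
    return {
        "bytes_added": sum(added.values()),      # must be copied and cached anew
        "bytes_removed": sum(removed.values()),  # purged along with their cached pages
        "files_kept": len(new_files) - len(added),
    }


# Made-up sizes for illustration only.
old = {"_a.cfs": 900_000_000, "_b.cfs": 50_000_000}
# Small update without optimizing: one new small segment appears.
incremental = {"_a.cfs": 900_000_000, "_b.cfs": 50_000_000, "_c.cfs": 1_000_000}
# Same update after an optimize: everything is rewritten into one new segment.
optimized = {"_d.cfs": 951_000_000}

print(snapshot_churn(old, incremental))
print(snapshot_churn(old, optimized))
```

With these made-up sizes the incremental snapshot brings in about 1 MB of new files, while the optimized one replaces the entire index even though the logical update is the same.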



-Hoss


Re: Query Performance while updating the index

Posted by oleg_gnatovskiy <ol...@citysearch.com>.
We've tried it. There doesn't seem to be any connection between GC and the
bad performance spikes.


Otis Gospodnetic wrote:
> 
> OK.  Then it's likely not this.  You saw the other response about looking
> at GC to see if maybe that hits you once in a while and slows whatever
> queries are in flight?  Try jconsole.
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


Re: Query Performance while updating the index

Posted by Otis Gospodnetic <ot...@yahoo.com>.
OK.  Then it's likely not this.  You saw the other response about looking at GC to see if maybe that hits you once in a while and slows whatever queries are in flight?  Try jconsole.
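
If watching jconsole by hand is impractical, GC pauses can also be pulled out of a GC log and lined up against snapshot install times. The following is only a sketch: it assumes the JVM writes the classic one-line HotSpot format (for example when started with -verbose:gc and -XX:+PrintGCTimeStamps), which varies by JVM version and flags. The function name, the threshold, and the sample lines are hypothetical.

```python
import re

# Assumed log line shape (depends on JVM version and flags):
#   "905.113: [Full GC 812345K->401234K(1048576K), 3.4567890 secs]"
GC_LINE = re.compile(
    r"^(?P<uptime>\d+\.\d+): \[(?P<kind>Full GC|GC)\b.*?(?P<pause>\d+\.\d+) secs\]"
)


def long_pauses(lines, threshold_secs=1.0):
    """Return (uptime_secs, kind, pause_secs) for collections at or above the threshold."""
    hits = []
    for line in lines:
        m = GC_LINE.match(line.strip())
        if m and float(m.group("pause")) >= threshold_secs:
            hits.append(
                (float(m.group("uptime")), m.group("kind"), float(m.group("pause")))
            )
    return hits


# Hypothetical sample lines, not taken from any real server.
sample = [
    "10.250: [GC 52480K->10312K(253440K), 0.0312450 secs]",
    "905.113: [Full GC 812345K->401234K(1048576K), 3.4567890 secs]",
    "not a gc line",
]
print(long_pauses(sample))
```

If the long pauses do not cluster around the times a new snapshot is installed, GC is probably not what slows the queries down.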

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: oleg_gnatovskiy <ol...@citysearch.com>
> To: solr-user@lucene.apache.org
> Sent: Thursday, January 22, 2009 2:43:31 PM
> Subject: Re: Query Performance while updating the index
> 
> 
> We do optimize the index before updates but we get these performance issues
> even when we pull an empty snapshot. Thus even when our update is tiny, the
> performance issues still happen.


Re: Query Performance while updating the index

Posted by oleg_gnatovskiy <ol...@citysearch.com>.
We do optimize the index before updates but we get these performance issues
even when we pull an empty snapshot. Thus even when our update is tiny, the
performance issues still happen.



Otis Gospodnetic wrote:
> 
> This is an old and long thread, and I no longer recall what the specific
> suggestions were.
> My guess is this has to do with the OS cache of your index files.  When
> you make the large index update, that OS cache is useless (old files are
> gone, new ones are in) and the OS cache has to get re-warmed and this takes
> time.
> 
> Are you optimizing your index before the update?  Do you *really* need to
> do that?
> How large is your update, what makes it big, and could you make it
> smaller?
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


Re: Query Performance while updating the index

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Oleg,

This is more of an OS-level thing than a Solr thing, it seems from your emails.  If you send answers to my questions we'll be able to help more.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: oleg_gnatovskiy <ol...@citysearch.com>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, January 21, 2009 1:09:21 PM
> Subject: Re: Query Performance while updating the index
> 
> 
> What exactly does Solr do when it receives a new Index? How does it keep
> serving while performing the updates? It seems that the part that causes the
> slowdown is this transition.


Re: Query Performance while updating the index

Posted by oleg_gnatovskiy <ol...@citysearch.com>.
What exactly does Solr do when it receives a new Index? How does it keep
serving while performing the updates? It seems that the part that causes the
slowdown is this transition.




Otis Gospodnetic wrote:
> 
> This is an old and long thread, and I no longer recall what the specific
> suggestions were.
> My guess is this has to do with the OS cache of your index files.  When
> you make the large index update, that OS cache is useless (old files are
> gone, new ones are in) and the OS cache has to get re-warmed and this takes
> time.
> 
> Are you optimizing your index before the update?  Do you *really* need to
> do that?
> How large is your update, what makes it big, and could you make it
> smaller?
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


Re: Query Performance while updating the index

Posted by Otis Gospodnetic <ot...@yahoo.com>.
This is an old and long thread, and I no longer recall what the specific suggestions were.
My guess is this has to do with the OS cache of your index files.  When you make the large index update, that OS cache is useless (old files are gone, new ones are in) and the OS cache has to get re-warmed, and this takes time.
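
One common way to shorten that re-warming, offered here as an assumption rather than something tried in this thread, is to read the new index files once before the server goes back into rotation, so the OS has a chance to pull them into its cache. A minimal sketch follows; the function name and the index directory path are hypothetical, and whether the pages stay cached depends on how much free memory the machine has.

```python
import os


def warm_page_cache(index_dir, chunk_size=1 << 20):
    """Read every file under index_dir once so the OS can cache its pages.

    index_dir is a hypothetical path to the freshly installed index.
    Returns the total number of bytes read.
    """
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            with open(os.path.join(root, name), "rb") as fh:
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total
```

Calling warm_page_cache() on the new index directory right after the snapshot is installed, while the server is still out of the load balancer, trades a burst of sequential disk reads for fewer slow random reads under live query traffic.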

Are you optimizing your index before the update?  Do you *really* need to do that?
How large is your update, what makes it big, and could you make it smaller?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: oleg_gnatovskiy <ol...@citysearch.com>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, January 20, 2009 6:19:46 PM
> Subject: Re: Query Performance while updating the index
> 
> 
> Hello again. It seems that we are still having these problems. Queries take
> as long as 20 minutes to get back to their average response time after a
> large index update, so it doesn't seem like the problem is the 12 second
> autowarm time. Are there any more suggestions for things we can try? Taking
> our servers out of the loop for as long as 20 minutes is a bit of a hassle,
> and a risk.