Posted to solr-user@lucene.apache.org by hbi dev <hb...@googlemail.com> on 2009/01/22 12:39:39 UTC

Intermittent high response times

Hi all,
I have an implementation of Solr (rev. 708837) running on Tomcat 6.

Approx 600,000 docs, 2 fairly content-heavy text fields, and between 4 and 7
facets (depending on what our front end is requesting; most have few unique
values).

1GB of memory is allocated; generally I do not see it using all of that.

For the most part my response times are under 200ms, but I randomly get
times that are around 100,000ms!

Original load testing didn't reveal this. I can see from the logs that we are
getting approx 20 requests per second, so it's not really under much load at
the moment.

Does anyone have any pointers that I can follow or look into?
Please ask if I need to provide any more info.

Thanks in advance

Regards,
Waseem

Re: Intermittent high response times

Posted by wojtekpia <wo...@hotmail.com>.
The type of garbage collector definitely affects performance, but there are
other settings as well. There's a related thread currently discussing this:
http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-td21588427.html



hbi dev wrote:
> 
> Hi wojtekpia,
> 
> That's interesting, I shall be looking into this over the weekend so I
> shall
> look at the GC also. I was briefly reading about GC last night, am I right
> in thinking it could be affected by what version of the jvm I'm using
> (1.5.0.8), and also what type of Collector is set? What collector is the
> default, and what would people recommend for an application like Solr?
> Thanks
> Waseem
> 

-- 
View this message in context: http://www.nabble.com/Intermittent-high-response-times-tp21602475p21628769.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Intermittent high response times

Posted by hbi dev <hb...@googlemail.com>.
Hi wojtekpia,

That's interesting. I shall be looking into this over the weekend, so I shall
look at the GC as well. I was briefly reading about GC last night. Am I right
in thinking it could be affected by which version of the JVM I'm using
(1.5.0.8), and also by which type of collector is set? Which collector is the
default, and what would people recommend for an application like Solr?
Thanks
Waseem
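
A minimal sketch of how the JVM version and startup flags could be checked from code, using the standard java.lang.management API. The class name JvmInfo is made up for illustration, and the code reports on the JVM it runs in, so it would have to run inside the same Tomcat JVM as Solr to say anything about that instance. If a collector flag such as -XX:+UseConcMarkSweepGC had been passed, it would show up in the listed arguments; if none appears, the JVM is using whatever default it picked for that machine.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

public class JvmInfo {

    /** Describes the running JVM: name, version, and any -X / -XX startup flags. */
    public static String describe() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        StringBuilder sb = new StringBuilder();
        sb.append("vm=").append(runtime.getVmName())
          .append(' ').append(runtime.getVmVersion()).append('\n');
        // Heap sizes (-Xmx, -Xms) and collector choices (-XX:...) are passed
        // as startup arguments, so listing them shows what was set explicitly.
        for (String arg : runtime.getInputArguments()) {
            if (arg.startsWith("-X")) {
                sb.append("flag=").append(arg).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```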

On Thu, Jan 22, 2009 at 5:24 PM, wojtekpia <wo...@hotmail.com> wrote:

>
> I'm experiencing similar issues. Mine seem to be related to old generation
> garbage collection. Can you monitor your garbage collection activity? (I'm
> using JConsole to monitor it:
> http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html).
>
> In my system, garbage collection usually doesn't cause any trouble. But
> once
> in a while, the size of the old generation flat-lines for some time
> (~dozens
> of seconds). When this happens, I see really bad response times from Solr
> (not quite as bad as you're seeing, but almost). The old-gen flat-lines
> always seem to be right before, or right after the old-gen is garbage
> collected.
> --
> View this message in context:
> http://www.nabble.com/Intermittent-high-response-times-tp21602475p21608986.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

Re: Intermittent high response times

Posted by wojtekpia <wo...@hotmail.com>.
I'm experiencing similar issues. Mine seem to be related to old generation
garbage collection. Can you monitor your garbage collection activity? (I'm
using JConsole to monitor it:
http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html). 

In my system, garbage collection usually doesn't cause any trouble. But once
in a while, the size of the old generation flat-lines for some time (~dozens
of seconds). When this happens, I see really bad response times from Solr
(not quite as bad as you're seeing, but almost). The old-gen flat-lines
always seem to be right before, or right after the old-gen is garbage
collected.
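
For anyone who would rather log these numbers than watch them in JConsole, here is a minimal sketch using the standard GarbageCollectorMXBean API. The class name GcSnapshot is hypothetical, and the code only sees the JVM it runs in, so it is assumed to run inside the same JVM as Solr (for example from a background thread). The counters are cumulative since JVM start, so a pause would show up as a jump in timeMs between two snapshots.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcSnapshot {

    /** Returns one line per collector: name, cumulative run count, cumulative time in ms. */
    public static List<String> snapshot() {
        List<String> lines = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values are totals since JVM start; -1 means the JVM does not report them.
            lines.add(gc.getName()
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Print a single snapshot; calling snapshot() periodically and comparing
        // consecutive values shows how much time each collector used in between.
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```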
-- 
View this message in context: http://www.nabble.com/Intermittent-high-response-times-tp21602475p21608986.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Intermittent high response times

Posted by hbi dev <hb...@googlemail.com>.
Hi,
The criteria rarely vary from those of queries that are much quicker; maybe
only the start row differs. Most of the time the main "terms" are a single word
or just a "blank query" (q.alt=*:*).

My request handler does have a lot of predefined filters; it is included
below. Most of this is auto-warmed.
The server also does updates via the DataImportHandler every 5 minutes.
Optimisation is only performed once a day at approximately midnight. These
high response times can happen at any time of day, but mostly out of working
hours, which is also when we have the fewest updates and the least search
traffic.

As for CPU and IO usage: as mentioned above, the slow responses mostly occur
out of hours, so I will see whether our server admins have set up any SNMP
tools that can provide reports for me. Looking at the server right now, I can
see between 2 and 10% CPU.


Here is a small extract from the log:

21-Jan-2009 19:45:39 org.apache.solr.core.SolrCore execute
INFO: [news] webapp=/solr path=/select
params={rows=10&start=40&sort=score+desc,+newsArticleDate_Date+desc,+newsCalculatedImportance+desc&fq=newsArticleDate_Year:1995&hl=true&qt=BR2News}
hits=1106 status=0 QTime=31
21-Jan-2009 19:45:39 org.apache.solr.core.SolrCore execute
INFO: [news] webapp=/solr path=/select
params={rows=10&start=100&sort=score+desc,+newsArticleDate_Date+desc,+newsCalculatedImportance+desc&fq=newsArticleDate_Year:1996&hl=true&qt=BR2News}
hits=8345 status=0 QTime=119234
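
A minimal sketch of pulling the slow requests out of a log like this, so they can be lined up against GC or update activity. The class name SlowQueryFinder and the 1000 ms threshold are assumptions for illustration; the only thing taken from the log is the QTime=<milliseconds> token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SlowQueryFinder {

    // Matches the QTime=<milliseconds> token that ends each request log entry.
    private static final Pattern QTIME = Pattern.compile("QTime=(\\d+)");

    /** Returns every QTime value (in ms) at or above thresholdMs, in log order. */
    public static List<Long> slowQTimes(List<String> logLines, long thresholdMs) {
        List<Long> slow = new ArrayList<Long>();
        for (String line : logLines) {
            Matcher m = QTIME.matcher(line);
            while (m.find()) {
                long qtime = Long.parseLong(m.group(1));
                if (qtime >= thresholdMs) {
                    slow.add(qtime);
                }
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        // The two QTime lines from the extract above.
        List<String> sample = Arrays.asList(
                "hits=1106 status=0 QTime=31",
                "hits=8345 status=0 QTime=119234");
        System.out.println(slowQTimes(sample, 1000L)); // prints [119234]
    }
}
```

In practice the lines would be read from the Tomcat log file, and the timestamp line preceding each slow entry kept alongside it.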


The request handler is:

<requestHandler name="BR2News" class="solr.DisMaxRequestHandler">
  <lst name="defaults">
    <int name="rows">10</int>
    <str name="hl">false</str>
    <str name="sort">score desc, newsArticleDate_Date desc, newsCalculatedImportance desc</str>
    <str name="f.newsArticleDate_Year.facet.sort">false</str>
    <str name="f.newsArticleDate_Month.facet.sort">false</str>
    <str name="f.newsArticleDate_Day.facet.sort">false</str>
    <str name="facet.mincount">1</str>
    <str name="mlt">false</str>
    <str name="wt">xslt</str>
    <str name="tr">newsResults.xsl</str>
  </lst>
  <lst name="appends">
    <str name="fq">news_magJournalCode:BR2 OR news_magJournalCode:CAM OR
      news_magJournalCode:CEI OR news_magJournalCode:CIT OR
      news_magJournalCode:DRN OR news_magJournalCode:EVE OR
      news_magJournalCode:MKT OR news_magJournalCode:MXD OR
      news_magJournalCode:PRA OR news_magJournalCode:PRI OR
      news_magJournalCode:PRS OR news_magJournalCode:PRW OR
      news_magJournalCode:REV OR news_magJournalCode:RSV OR
      news_magJournalCode:WWP OR news_magJournalCode:XMB OR
      news_magJournalCode:XMW OR news_magJournalCode:XX6</str>
    <str name="fq">newsStatus:true</str>
    <str name="fq">newsPublishedDate_Date:[* TO NOW/DAY]</str>
    <str name="fq">newsArticleDate_Date:[* TO NOW/DAY]</str>
  </lst>
  <lst name="invariants">
    <!-- mm=1 ONLY TOUCH THIS IF YOU REALLY KNOW WHAT YOU ARE DOING!!!!! -->
    <str name="mm">1</str>
    <!-- mm=1 ONLY TOUCH THIS IF YOU REALLY KNOW WHAT YOU ARE DOING!!!!! -->
    <str name="hl.simple.pre"><![CDATA[<span class="hiLite">]]></str>
    <str name="hl.simple.post"><![CDATA[</span>]]></str>
    <str name="f.newsBody.hl.snippets">3</str>
    <str name="f.newsBody.hl.mergeContiguous">true</str>
    <str name="q.alt">*:*</str>
    <str name="echoParams">all</str>
    <float name="tie">0.01</float>
    <str name="fl">newsID,newsTitle,newsTitleAlternate,newsAuthor,news_magJournalCode,news_newsTypeID,newsStatus,magName,newsSeoURLTitle,newsSummary,newsBody,newsSummaryAlternate,newsArticleDate_DateTime,newsDateAdded_DateTime,newsAuthor,score</str>
    <str name="version">2.2</str>
    <str name="qf">newsTitle newsSummary^0.75 newsBody^0.5 newsAuthor^0.1</str>
    <!-- no "exact" author field as this is already indexed appropriately -->
    <str name="pf">newsTitleExact newsSummaryExact newsBodyExact newsAuthor</str>
    <str name="ps">1</str>
    <str name="hl.fl">newsTitle newsBody newsSummary newsAuthor</str>
    <str name="mlt.fl">newsTitle newsBody newsSummary newsAuthor</str>
    <str name="facet">true</str>
    <str name="facet.field">news_magJournalCode_FacetDetails</str>
    <str name="facet.field">news_newsTypeID_FacetDetails</str>
    <str name="facet.field">sector_FacetDetails</str>
    <str name="facet.field">discipline_FacetDetails</str>
    <str name="facet.field">asset_FacetDetails</str>
    <!-- get facet for "today" -->
    <str name="facet.query">newsArticleDate_Date:[NOW/DAY TO NOW/DAY]</str>
    <!-- get facet for "lastweek" (last 7 days) -->
    <str name="facet.query">newsArticleDate_Date:[NOW/DAY-7DAYS TO NOW/DAY]</str>
    <!-- get facet for "lastmonth" -->
    <str name="facet.query">newsArticleDate_Date:[NOW/DAY-1MONTH TO NOW/DAY]</str>
    <str name="facet.field">newsArticleDate_Year</str>
    <str name="facet.field">newsArticleDate_Month</str>
    <str name="facet.field">newsArticleDate_Day</str>
    <!-- get facet for "today" -->
    <str name="facet.query">newsDateAdded_Date:[NOW/DAY TO NOW/DAY]</str>
    <!-- get facet for "lastweek" (last 7 days) -->
    <str name="facet.query">newsDateAdded_Date:[NOW/DAY-7DAYS TO NOW/DAY]</str>
    <!-- get facet for "lastmonth" -->
    <str name="facet.query">newsDateAdded_Date:[NOW/DAY-1MONTH TO NOW/DAY]</str>
    <str name="facet.field">newsDateAdded_Year</str>
    <str name="facet.field">newsDateAdded_Month</str>
    <str name="facet.field">newsDateAdded_Day</str>
  </lst>
</requestHandler>

On Thu, Jan 22, 2009 at 2:23 PM, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:

> Hi,
>
> Is there anything special about those queries?  e.g. lots of terms,
> frequent terms, something else?  Is there anything else happening on that
> server when you see such long queries?  Do you see lots of IO or lots of CPU
> being used during those times?
>
>
> Otis --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
> > From: hbi dev <hb...@googlemail.com>
> > To: solr-user@lucene.apache.org
> > Sent: Thursday, January 22, 2009 6:39:39 AM
> > Subject: Intermittent high response times
> >
> > Hi all,
> > I have an implementation of solr (rev.708837) running on tomcat 6.
> >
> > Approx 600,000 docs, 2 fairly content heavy text fields, between 4 and 7
> > facets (depending on what our front end is requesting, and mostly low
> unique
> > values)
> >
> > 1GB of memory allocated, generally I do not see it using all of that up.
> >
> > For the most part my response times are under 200ms, but I randomly get
> > times that are around 100,000ms!
> >
> > Original load testing didn't reveal this, I can see from the logs we are
> > getting approx 20 requests per second so it's not really under much load
> at
> > the moment.
> >
> > Does anyone have any pointers that I can follow or look into?
> > Please ask if I need to provide any more info.
> >
> > Thanks in advance
> >
> > Regards,
> > Waseem
>
>

Re: Intermittent high response times

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi,

Is there anything special about those queries?  e.g. lots of terms, frequent terms, something else?  Is there anything else happening on that server when you see such long queries?  Do you see lots of IO or lots of CPU being used during those times?


Otis --
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: hbi dev <hb...@googlemail.com>
> To: solr-user@lucene.apache.org
> Sent: Thursday, January 22, 2009 6:39:39 AM
> Subject: Intermittent high response times
> 
> Hi all,
> I have an implementation of solr (rev.708837) running on tomcat 6.
> 
> Approx 600,000 docs, 2 fairly content heavy text fields, between 4 and 7
> facets (depending on what our front end is requesting, and mostly low unique
> values)
> 
> 1GB of memory allocated, generally I do not see it using all of that up.
> 
> For the most part my response times are under 200ms, but I randomly get
> times that are around 100,000ms!
> 
> Original load testing didn't reveal this, I can see from the logs we are
> getting approx 20 requests per second so it's not really under much load at
> the moment.
> 
> Does anyone have any pointers that I can follow or look into?
> Please ask if I need to provide any more info.
> 
> Thanks in advance
> 
> Regards,
> Waseem