Posted to solr-user@lucene.apache.org by Ravi Solr <ra...@gmail.com> on 2013/03/25 16:27:17 UTC

Query slow with termVectors termPositions termOffsets

Hello,
        We re-indexed our entire core of 1157777 docs with some of the
fields having termVectors="true" termPositions="true" termOffsets="true";
prior to the reindex we only had termVectors="true". After the reindex the
query component has become very slow. I thought that adding termOffsets
and termPositions would increase the speed. Am I wrong? Several
queries like the one shown below, which used to run fine, are now very slow.
Can somebody kindly clarify how termOffsets and termPositions affect the
query component?
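
For context, the schema change described above would look roughly like the
following in schema.xml. This is only a sketch: the field name is borrowed
from the query below and the type name is assumed, since the post does not
say which fields were changed. As far as the Solr documentation describes
them, these three attributes control per-document term vectors, which mainly
serve features such as highlighting and MoreLikeThis rather than ordinary
query matching.

```xml
<!-- Before the reindex: term vectors only.
     "fullbody" and "text_general" are illustrative assumptions. -->
<field name="fullbody" type="text_general" indexed="true" stored="true"
       termVectors="true"/>

<!-- After the reindex: term vectors with positions and offsets,
     which stores more data per document and grows the index. -->
<field name="fullbody" type="text_general" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```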

<lst name="process"><double name="time">19076.0</double>
 <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">18972.0</double></lst>
 <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.QueryElevationComponent"><double name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.clustering.ClusteringComponent"><double name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">104.0</double></lst>
</lst>


[#|2013-03-25T11:22:53.446-0400|INFO|sun-appserver2.1|org.apache.solr.core.SolrCore|_ThreadID=45;_ThreadName=httpSSLWorkerThread-9001-19;|[xxx]
webapp=/solr-admin path=/select
params={q=primarysectionnode:(/national*+OR+/health*)+OR+(contenttype:Blog+AND+subheadline:("The+Checkup"+OR+"Checkpoint+Washington"+OR+"Post+Carbon"+OR+TSA+OR+"College+Inc."+OR+"Campus+Overload"+OR+"Planet+Panel"+OR+"The+Answer+Sheet"+OR+"Class+Struggle"+OR+"BlogPost"))+OR+(contenttype:"Photo+Gallery"+AND+headline:"day+in+photos")&start=0&rows=1&sort=displaydatetime+desc&fq=-source:(Reuters+OR+"PC+World"+OR+"CBS+News"+OR+NC8/WJLA+OR+"NewsChannel+8"+OR+NC8+OR+WJLA+OR+CBS)+-contenttype:("Discussion"+OR+"Photo")+-slug:(op-*dummy*+OR+noipad-*)+-(contenttype:"Photo+Gallery"+AND+headline:("Drawing+Board"+OR+"Drawing+board"+OR+"drawing+board"))+headline:[*+TO+*]+contenttype:[*+TO+*]+pubdatetime:[NOW/DAY-3YEARS+TO+NOW/DAY%2B1DAY]+-headline:("Summary+Box*"+OR+"Video*"+OR+"Post+Sports+Live*")+-slug:(warren*+OR+"history")+-(contenttype:Blog+AND+subheadline:("DC+Schools+Insider"+OR+"On+Leadership"))+contenttype:"Blog"+-systemid:(999c7102-955a-11e2-95ca-dd43e7ffee9c+OR+72bbb724-9554-11e2-95ca-dd43e7ffee9c+OR+2d008b80-9520-11e2-95ca-dd43e7ffee9c+OR+d2443d3c-9514-11e2-95ca-dd43e7ffee9c+OR+173764d6-9520-11e2-95ca-dd43e7ffee9c+OR+0181fd42-953c-11e2-95ca-dd43e7ffee9c+OR+e6cacb96-9559-11e2-95ca-dd43e7ffee9c+OR+03288052-9501-11e2-95ca-dd43e7ffee9c+OR+ddbf020c-9517-11e2-95ca-dd43e7ffee9c)+fullbody:[*+TO+*]&wt=javabin&version=2}
hits=4985 status=0 QTime=19044 |#]

Thanks,

Ravi Kiran Bhaskar

Re: Query slow with termVectors termPositions termOffsets

Posted by Ravi Solr <ra...@gmail.com>.
Yes, the index size increased after turning on termPositions and termOffsets.

Ravi Kiran Bhaskar

On Mon, Mar 25, 2013 at 1:13 PM, <al...@aim.com> wrote:

> Did index size increase after turning on termPositions and termOffsets?
>
> Thanks.
> Alex.

Re: Query slow with termVectors termPositions termOffsets

Posted by al...@aim.com.
Did index size increase after turning on termPositions and termOffsets?

Thanks.
Alex.

 

 

 
