Posted to solr-user@lucene.apache.org by MOUSSA MZE Oussama-ext <ou...@pole-emploi.fr> on 2018/02/19 15:47:39 UTC
Facet performance problem
Hi
We have the following environment:
3-node cluster
1 shard
Replication factor = 2
8GB per node
29 million documents
We are faceting on the field "motifPresence", defined as follows:
<field name="motifPresence" type="string" docValues="true" indexed="false" stored="true" required="false"/>
Once the user selects a motifPresence filter, we execute the search again with:
fq: (value1 OR value2 OR value3 OR ...)
The problem is that the query with facet filtering is too slow: its response time is greater than that of the main search (without facet filtering).
Thanks in advance!
Re: Facet performance problem
Posted by Shawn Heisey <el...@elyograg.org>.
On 2/20/2018 1:18 AM, LOPEZ-CORTES Mariano-ext wrote:
> We return a facet list of the values in the "motifPresence" field (person status).
> Status:
> [ ] status1
> [x] status2
> [x] status3
>
> The user then selects one or more statuses (this is the step we call "facet filtering").
>
> The query is then re-executed with fq=motifPresence:(status2 OR status3)
>
> We use fq so as not to alter the score of the main query.
>
> We've read that facet fields should have docValues=true.
>
> Do we also need indexed=true?
Facets, grouping, and sorting are more efficient with docValues, but
searches aren't helped by docValues. Without indexed="true", searches
on the field will be VERY slow. A filter query is still a search. The
"filter" in filter query just refers to the fact that it's separate from
the main query, and that it does not affect relevancy scoring.
Thanks,
Shawn
RE: Facet performance problem
Posted by LOPEZ-CORTES Mariano-ext <ma...@pole-emploi.fr>.
Our query looks like this:
...facet=true&facet.field=motifPresence
We return a facet list of the values in the "motifPresence" field (person status).
Status:
[ ] status1
[x] status2
[x] status3
The user then selects one or more statuses (this is the step we call "facet filtering").
The query is then re-executed with fq=motifPresence:(status2 OR status3)
We use fq so as not to alter the score of the main query.
We've read that facet fields should have docValues=true.
Do we also need indexed=true?
Is there any other problem with our solution?
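For illustration, the two requests described above could be assembled roughly as follows. This is only a sketch: the function names and the example main query are hypothetical, the elided part of the real query ("...") is not reconstructed, and the status values are assumed to be simple tokens that need no escaping.

```python
from urllib.parse import urlencode

def facet_request(main_query):
    """First request: the main query plus facet counts on motifPresence."""
    return urlencode({
        "q": main_query,              # hypothetical main query, not from the thread
        "facet": "true",
        "facet.field": "motifPresence",
    })

def filtered_request(main_query, selected_statuses):
    """Second request: same main query, restricted by a filter query (fq)."""
    # Builds motifPresence:(status2 OR status3) from the ticked checkboxes.
    fq = "motifPresence:(" + " OR ".join(selected_statuses) + ")"
    return urlencode({"q": main_query, "fq": fq})

if __name__ == "__main__":
    print(facet_request("emploi"))
    print(filtered_request("emploi", ["status2", "status3"]))
```

Because the restriction is sent as fq rather than folded into q, it narrows the result set without taking part in relevancy scoring.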
-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: Monday, February 19, 2018 18:18
To: solr-user
Subject: Re: Facet performance problem
I'm confused here. What do you mean by "facet filtering"? Your examples have no facets at all, just a _filter query_.
I'll assume you want to use filter query (fq), and faceting has nothing to do with it. This is one of the tricky bits of docValues.
While it's _possible_ to search on a field that's defined as above, it's very inefficient since there's no "inverted index" for the field: you specified indexed="false". So the docValues are searched, and it's essentially a table scan.
If you mean to search against this field, set indexed="true". You'll have to completely reindex your corpus of course.
If you intend to facet, group or sort on this field, you should _also_ have docValues="true".
Best,
Erick
Re: Facet performance problem
Posted by Erick Erickson <er...@gmail.com>.
I'm confused here. What do you mean by "facet filtering"? Your
examples have no facets at all, just a _filter query_.
I'll assume you want to use filter query (fq), and faceting has
nothing to do with it. This is one of the tricky bits of docValues.
While it's _possible_ to search on a field that's defined as above,
it's very inefficient since there's no "inverted index" for the field:
you specified indexed="false". So the docValues are searched, and
it's essentially a table scan.
If you mean to search against this field, set indexed="true". You'll
have to completely reindex your corpus of course.
If you intend to facet, group or sort on this field, you should _also_
have docValues="true".
Best,
Erick
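To make the "table scan" point concrete, here is a toy model in Python. It is an assumption-laden sketch, not how Lucene actually stores data: a plain list stands in for docValues (doc id to value) and a dict stands in for the inverted index (value to doc ids). All names and the sample data are made up for illustration.

```python
from collections import defaultdict

# Toy data: document id -> value of the field.
docs = {0: "status1", 1: "status2", 2: "status3", 3: "status2", 4: "status1"}

# Column-style storage (the role docValues play): one value per doc id.
column = [docs[i] for i in range(len(docs))]

# Inverted index (the role indexed="true" plays): value -> list of doc ids.
inverted = defaultdict(list)
for doc_id, value in docs.items():
    inverted[value].append(doc_id)

def filter_by_scan(wanted):
    """No inverted index: every document's value has to be inspected."""
    return [doc_id for doc_id, value in enumerate(column) if value in wanted]

def filter_by_index(wanted):
    """Inverted index: go straight to the doc ids listed under each value."""
    return sorted(doc_id for value in wanted for doc_id in inverted.get(value, []))

def facet_counts(doc_ids):
    """Faceting needs doc id -> value, which the column answers directly."""
    counts = defaultdict(int)
    for doc_id in doc_ids:
        counts[column[doc_id]] += 1
    return dict(counts)

if __name__ == "__main__":
    wanted = {"status2", "status3"}
    print(filter_by_scan(wanted))    # touches all 5 documents
    print(filter_by_index(wanted))   # touches only the 3 matching entries
    print(facet_counts(range(len(column))))
```

Both filters return the same documents, but the scan's cost grows with the total number of documents (29 million in this thread), while the lookup's cost grows only with the number of matches. Faceting runs in the opposite direction, which is why the field benefits from having both structures.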