Posted to solr-user@lucene.apache.org by Elodie Sannier <el...@kelkoo.fr> on 2013/04/25 15:22:11 UTC
FieldCache insanity with field used as facet and group
Hello,
I am using the Lucene FieldCache with SolrCloud and I have "insane" instances with messages like:
VALUEMISMATCH: Multiple distinct value objects for SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)+merchantid
    'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'=>'merchantid',class org.apache.lucene.index.SortedDocValues,0.5=>org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#557711353
    'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'=>'merchantid',int,null=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
    'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'=>'merchantid',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
All insane instances are for a field "merchantid" of type "int" that is used both as a facet field and as a group field.
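To see at a glance which cache entries such a message lists, it can be split into its parts. The following is only a sketch: the helper names `parse_insanity` and `distinct_values` are hypothetical, and the regular expression is inferred from the message text shown above, not taken from any Lucene API.

```python
import re
from collections import defaultdict

# Entry layout inferred from the message above (an assumption):
#   'reader'=>'field',type,parser=>valueClass#identityHash
ENTRY_RE = re.compile(r"'([^']+)'=>'([^']+)',(.+?),(\S+?)=>(\S+?)#(\d+)")

def parse_insanity(message):
    """Return one dict per cache entry found in a VALUEMISMATCH message."""
    keys = ("reader", "field", "type", "parser", "value_class", "value_id")
    return [dict(zip(keys, match)) for match in ENTRY_RE.findall(message)]

def distinct_values(entries):
    """Map (reader, field) to the set of distinct cached value objects."""
    grouped = defaultdict(set)
    for e in entries:
        grouped[(e["reader"], e["field"])].add((e["value_class"], e["value_id"]))
    return dict(grouped)

# The message from this post, joined into one string.
READER = "SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)"
SAMPLE = (
    "VALUEMISMATCH: Multiple distinct value objects for " + READER + "+merchantid "
    "'" + READER + "'=>'merchantid',class org.apache.lucene.index.SortedDocValues,0.5"
    "=>org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#557711353 "
    "'" + READER + "'=>'merchantid',int,null"
    "=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713 "
    "'" + READER + "'=>'merchantid',int,"
    "org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER"
    "=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713"
)

entries = parse_insanity(SAMPLE)
for key, values in distinct_values(entries).items():
    print(key[1], "entries:", len(entries), "distinct values:", len(values))
```

For the message above this yields three cache entries but only two distinct value objects: the two "int" entries share one IntsFromArray instance, and the SortedDocValues entry holds a second object for the same segment and field.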
I'm using a custom SearchHandler that makes two sub-queries: the first with group.field=merchantid and the second with facet.field=merchantid.
When I use the parameter facet.method=enum, the insane instances do not appear, but I'm not sure that is the right fix.
Can this insanity have a performance impact?
How can I fix it?
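For reference, the two sub-queries and the facet.method=enum variant can be written out as plain request URLs. This is a sketch under assumptions: the host, port and `/solr/select` path are those of the stock example setup, not necessarily the real deployment, and `select_url` is a hypothetical helper. As far as the Solr documentation describes it, facet.method=enum counts facets by enumerating the field's terms instead of un-inverting the field, which would be consistent with the mismatch disappearing, but that is not verified here.

```python
from urllib.parse import urlencode

# Assumed endpoint (stock example setup); adjust for a real deployment.
BASE = "http://localhost:8983/solr/select"

def select_url(extra, base=BASE):
    """Build a /select URL for q=*:* plus the given extra parameters."""
    params = {"q": "*:*"}
    params.update(extra)
    # Keep '*' and ':' readable instead of percent-encoding them.
    return base + "?" + urlencode(params, safe="*:")

group_url = select_url({"group": "true", "group.field": "merchantid"})
facet_url = select_url({"facet": "true", "facet.field": "merchantid"})
facet_enum_url = select_url(
    {"facet": "true", "facet.field": "merchantid", "facet.method": "enum"}
)

for url in (group_url, facet_url, facet_enum_url):
    print(url)
```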
Elodie Sannier
Kelkoo SAS
Société par Actions Simplifiée
Share capital: € 4.168.964,30
Registered office: 8, rue du Sentier 75002 Paris
425 093 069 RCS Paris
This message and its attachments are confidential and intended solely for their addressees. If you are not the intended recipient of this message, please delete it and notify the sender.
Re: FieldCache insanity with field used as facet and group
Posted by Elodie Sannier <el...@kelkoo.fr>.
I can reproduce the problem with the Solr 4.2.1 example and 2 shards.
1) started up solr shards, indexed the example data, and confirmed empty fieldCaches
[sanniere@funlevel-dx example]$ java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar
[sanniere@funlevel-dx example2]$ java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
2) used both grouping and faceting on the popularity field, then checked the fieldcache insanity count
[sanniere@funlevel-dx example]$ curl -sS "http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity" > /dev/null
[sanniere@funlevel-dx example]$ curl -sS "http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=popularity" > /dev/null
[sanniere@funlevel-dx example]$ curl -sS "http://localhost:8983/solr/admin/mbeans?stats=true&key=fieldCache&wt=json&indent=true" | grep -E "entries_count|insanity_count"
"entries_count":10,
"insanity_count":2,
"insanity#0":"VALUEMISMATCH: Multiple distinct value objects for SegmentCoreReader(owner=_g(4.2.1):C1)+popularity\n\t'SegmentCoreReader(owner=_g(4.2.1):C1)'=>'popularity',class org.apache.lucene.index.SortedDocValues,0.5=>org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#12129794\n\t'SegmentCoreReader(owner=_g(4.2.1):C1)'=>'popularity',int,null=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#12298774\n\t'SegmentCoreReader(owner=_g(4.2.1):C1)'=>'popularity',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#12298774\n",
"insanity#1":"VALUEMISMATCH: Multiple distinct value objects for SegmentCoreReader(owner=_f(4.2.1):C9)+popularity\n\t'SegmentCoreReader(owner=_f(4.2.1):C9)'=>'popularity',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#16648315\n\t'SegmentCoreReader(owner=_f(4.2.1):C9)'=>'popularity',int,null=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#16648315\n\t'SegmentCoreReader(owner=_f(4.2.1):C9)'=>'popularity',class org.apache.lucene.index.SortedDocValues,0.5=>org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#1130715\n"}}},
"HIGHLIGHTING",{},
"OTHER",{}]}
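Instead of grepping the JSON, the same check can be scripted. The sketch below assumes only that the mbeans response is JSON and that the fieldCache statistics contain the keys seen above (entries_count, insanity_count, insanity#N). It searches the document recursively, so it does not depend on the exact nesting. The function names and the trimmed sample response (including its "solr-mbeans" wrapper key and the shortened message strings) are illustrative assumptions.

```python
import json
from urllib.request import urlopen

def fetch(url):
    """Download a URL and return the body as text (not called in this sketch)."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")

def find_stats(node, key="insanity_count"):
    """Depth-first search for the first dict that contains `key`."""
    if isinstance(node, dict):
        if key in node:
            return node
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_stats(child, key)
        if found is not None:
            return found
    return None

def insanity_summary(response_text):
    """Return (entries_count, insanity_count, {insanity#N: message})."""
    stats = find_stats(json.loads(response_text)) or {}
    details = {k: v for k, v in stats.items() if k.startswith("insanity#")}
    return stats.get("entries_count"), stats.get("insanity_count"), details

# Trimmed, illustrative response; the message strings are shortened here.
SAMPLE = json.dumps({
    "solr-mbeans": [
        "CACHE", {"fieldCache": {"stats": {
            "entries_count": 10,
            "insanity_count": 2,
            "insanity#0": "VALUEMISMATCH: (shortened)",
            "insanity#1": "VALUEMISMATCH: (shortened)",
        }}},
        "HIGHLIGHTING", {},
        "OTHER", {},
    ]
})

entries_count, insanity_count, details = insanity_summary(SAMPLE)
print(entries_count, insanity_count, sorted(details))
```

Against a running node, `insanity_summary(fetch(url))` with the mbeans URL from step 2 should report the same counts as the grep output.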
I've updated https://issues.apache.org/jira/browse/SOLR-4866
Elodie
On 28.05.2013 10:22, Elodie Sannier wrote:
> I've created https://issues.apache.org/jira/browse/SOLR-4866
>
> Elodie
>
> On 07.05.2013 18:19, Chris Hostetter wrote:
>> [...]
Re: FieldCache insanity with field used as facet and group
Posted by Elodie Sannier <el...@kelkoo.fr>.
I've created https://issues.apache.org/jira/browse/SOLR-4866
Elodie
On 07.05.2013 18:19, Chris Hostetter wrote:
> [...]
Re: FieldCache insanity with field used as facet and group
Posted by Chris Hostetter <ho...@fucit.org>.
: I am using the Lucene FieldCache with SolrCloud and I have "insane" instances
: with messages like:
FWIW: I'm the one that named the result of these "sanity checks"
"FieldCacheInsanity" and I have regretted it ever since -- a better label
would have been "inconsistency".
: VALUEMISMATCH: Multiple distinct value objects for
: SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)+merchantid
: 'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'=>'merchantid',class
: org.apache.lucene.index.SortedDocValues,0.5=>org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#557711353
: 'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'=>'merchantid',int,null=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
: 'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'=>'merchantid',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
:
: All insane instances are for a field "merchantid" of type "int" used as facet
: and group field.
Interesting: it appears that the grouping code and the facet code are not
being consistent in how they are building the field cache, so you are
getting two objects in the cache for each segment.
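A toy model may make the mechanism concrete. This is not Lucene's implementation: `ToyFieldCache` and `value_mismatches` are hypothetical names, and the sketch assumes only what the message itself suggests, namely that entries are keyed by reader, field, type and parser, and that the check flags any reader and field pair holding more than one distinct value object.

```python
from collections import defaultdict

class ToyFieldCache:
    """Toy stand-in for a cache keyed by (segment, field, type, parser)."""

    def __init__(self):
        self.entries = {}

    def get(self, segment, field, value_type, parser=None):
        key = (segment, field, value_type, parser)
        if key not in self.entries:
            # Each new key builds its own value object, standing in for
            # an un-inverted copy of the field held in memory.
            self.entries[key] = object()
        return self.entries[key]

def value_mismatches(cache):
    """Return {(segment, field): n} where n > 1 distinct values are cached."""
    grouped = defaultdict(set)
    for (segment, field, _type, _parser), value in cache.entries.items():
        grouped[(segment, field)].add(value)
    return {key: len(vals) for key, vals in grouped.items() if len(vals) > 1}

cache = ToyFieldCache()
# Two callers ask for the same field of the same segment in different forms.
cache.get("_11i", "merchantid", "SortedDocValues")
cache.get("_11i", "merchantid", "int", "NUMERIC_UTILS_INT_PARSER")
# Asking again in the same form reuses the existing entry.
cache.get("_11i", "merchantid", "int", "NUMERIC_UTILS_INT_PARSER")

print(value_mismatches(cache))
```

In this model the segment ends up with two value objects for one field, which corresponds to the extra memory use described in this reply. Nothing is computed incorrectly.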
I haven't checked if this happens much with the example configs, but if
you could: please file a bug with the details of which Solr version you
are using along with the schema fieldType & field declarations for your
merchantid field, along with the mbean stats output showing the field
cache insanity after executing two queries like...
/select?q=*:*&facet=true&facet.field=merchantid
/select?q=*:*&group=true&group.field=merchantid
(that way we can rule out your custom SearchComponent as having a bug in
it)
: Can this insanity have a performance impact?
: How can I fix it?
The impact is just that more RAM is being used than is probably strictly
necessary. Unless there is something unusual in your fieldType
declaration, I don't think there is an easy fix you can apply -- we need to
fix the underlying code.
-Hoss