Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2013/06/12 21:18:20 UTC

[jira] [Updated] (SOLR-4866) FieldCache insanity when field is used in both faceting and grouping in distributed search (distributed grouping uses SortedDocValues)

     [ https://issues.apache.org/jira/browse/SOLR-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-4866:
---------------------------

    Description: 
Faceting on a fieldX, either single node or distributed, uses the FieldType of fieldX to fetch a type-based array of field values.  Grouping on fieldX uses the same type-based arrays in single-node Solr instances, but with distributed grouping the multipass grouping logic uses SortedDocValues from the FieldCache for fieldX. This results in "field cache insanity" if you also facet on this field, or execute a query against a single shard.
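
To make the mismatch concrete, here is a toy model (not Lucene's actual code) of what the sanity check reports: two different value objects cached for the same segment reader and field. The function name, the entry tuples, and the identifiers such as "segment_0" are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


def find_value_mismatches(entries):
    """Return {(reader, field): [value ids]} for keys cached under
    more than one distinct value object.

    entries is an iterable of (reader, field, cache_type, value_id)
    tuples. This is a simplified, hypothetical stand-in for the
    VALUEMISMATCH check, not the real implementation.
    """
    values_by_key = defaultdict(set)
    for reader, field, _cache_type, value_id in entries:
        values_by_key[(reader, field)].add(value_id)
    return {
        key: sorted(values)
        for key, values in values_by_key.items()
        if len(values) > 1
    }


# Hypothetical cache contents for one segment after both query styles ran.
entries = [
    # type-based array, as used by faceting and single-node grouping
    ("segment_0", "popularity", "int", "IntsFromArray#1"),
    # SortedDocValues, as used by distributed grouping
    ("segment_0", "popularity", "SortedDocValues", "SortedDocValuesImpl#2"),
]

mismatches = find_value_mismatches(entries)
```

With only the first entry present the function returns an empty dict; adding the SortedDocValues entry for the same reader and field is what makes the pair show up as a mismatch.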

This discrepancy can be observed in the example configs by executing a simple grouping query, and then also executing a distributed grouping query...

http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity
http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity&shards=localhost:8983/solr
http://localhost:8983/solr/admin/mbeans?stats=true&key=fieldCache
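
The three requests above could be scripted roughly as follows. This is only a sketch: it assumes the Solr example is running on localhost:8983 as in the URLs above, and the helper names (select_url, mbeans_url, fetch_all) are made up for this illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the example requests; adjust for your setup.
BASE = "http://localhost:8983/solr"


def select_url(field, shards=None):
    """Build a grouping query URL; pass shards to make it distributed."""
    params = [("q", "*:*"), ("group", "true"), ("group.field", field)]
    if shards:
        params.append(("shards", shards))
    # Keep '*', ':' and '/' unescaped so the URL matches the form above.
    return BASE + "/select?" + urlencode(params, safe="*:/")


def mbeans_url():
    """Build the URL that reports the fieldCache stats."""
    return BASE + "/admin/mbeans?" + urlencode(
        [("stats", "true"), ("key", "fieldCache")]
    )


def fetch_all(field="popularity", shards="localhost:8983/solr"):
    """Run the simple query, the distributed query, then fetch the stats.

    Requires a running Solr instance; returns the raw stats response.
    """
    for url in (select_url(field), select_url(field, shards)):
        with urlopen(url) as response:
            response.read()
    with urlopen(mbeans_url()) as response:
        return response.read().decode("utf-8")
```

Calling fetch_all() against a live instance returns the mbeans response, which can then be inspected for insanity entries.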



Background: http://markmail.org/thread/7gctyh6vn3eq5jso

  was:
I am using the Lucene FieldCache with SolrCloud 4.2.1 and I have "insane" instances for a field used as facet and group field.

Schema fieldType and field declaration for my merchantid field:
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" sortMissingLast="true" omitNorms="true" positionIncrementGap="0"/>

<field name="merchantid" type="int" indexed="true" stored="true" required="true"/>

The mbean stats output shows the field cache insanity after executing queries like:
/select?q=*:*&facet=true&facet.field=merchantid
/select?q=*:*&group=true&group.field=merchantid

<int name="insanity_count">25</int>
<str name="insanity#0">VALUEMISMATCH: Multiple distinct value objects for SegmentCoreReader(owner=_1z1(4.2.1):C3916)+merchantid
	'SegmentCoreReader(owner=_1z1(4.2.1):C3916)'=>'merchantid',class org.apache.lucene.index.SortedDocValues,0.5=>org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#1517585400
	'SegmentCoreReader(owner=_1z1(4.2.1):C3916)'=>'merchantid',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#781169939
	'SegmentCoreReader(owner=_1z1(4.2.1):C3916)'=>'merchantid',int,null=>org.apache.lucene.search.FieldCacheImpl$IntsFromArray#781169939
</str>
...

see http://markmail.org/thread/7gctyh6vn3eq5jso

        Summary: FieldCache insanity when field is used in both faceting and grouping in distributed search (distributed grouping uses SortedDocValues)  (was: FieldCache insanity with field used as facet and group)

updating summary and revising description based on narrowing down the root of the discrepancy
                
> FieldCache insanity when field is used in both faceting and grouping in distributed search (distributed grouping uses SortedDocValues)
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4866
>                 URL: https://issues.apache.org/jira/browse/SOLR-4866
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Sannier Elodie
>            Priority: Minor
>
> Faceting on a fieldX, either single node or distributed, uses the FieldType of fieldX to fetch a type-based array of field values.  Grouping on fieldX uses the same type-based arrays in single-node Solr instances, but with distributed grouping the multipass grouping logic uses SortedDocValues from the FieldCache for fieldX. This results in "field cache insanity" if you also facet on this field, or execute a query against a single shard.
> This discrepancy can be observed in the example configs by executing a simple grouping query, and then also executing a distributed grouping query...
> http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity
> http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity&shards=localhost:8983/solr
> http://localhost:8983/solr/admin/mbeans?stats=true&key=fieldCache
> Background: http://markmail.org/thread/7gctyh6vn3eq5jso
