Posted to dev@lucene.apache.org by "Erick Erickson (JIRA)" <ji...@apache.org> on 2014/03/19 01:28:43 UTC

[jira] [Commented] (SOLR-5878) Solr returns duplicates when using distributed search with group.format=simple

    [ https://issues.apache.org/jira/browse/SOLR-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940005#comment-13940005 ] 

Erick Erickson commented on SOLR-5878:
--------------------------------------

First, please raise issues like this on the user's list before
opening a JIRA, to be sure it's a bona fide bug. That reduces the
clutter in the JIRAs.

I don't know whether this is a real bug; you haven't
provided enough data to ascertain that. Is the "name" field
your <uniqueKey>? If not, then there's nothing saying multiple
documents can't have the same name. So return the
<uniqueKey> field (usually id). If you're getting the same
uniqueKey value more than once, that would indicate that you've
manually indexed the same document to different shards, which
would lead to the behavior you're seeing and is expected behavior.
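
The check described above could be sketched roughly as follows: query
each shard on its own, ask only for the <uniqueKey> field, and look for
keys that show up on more than one shard. This is a minimal sketch, not
a definitive procedure. It assumes the <uniqueKey> field is named id,
and the helper names (shard_query_url, find_cross_shard_duplicates) are
hypothetical and not part of Solr. Fetching the URLs and pulling the ids
out of each response is left out.

```python
from collections import defaultdict
from urllib.parse import urlencode


def shard_query_url(shard_base, key_field="id", rows=1000):
    """Build a select URL for one core that returns only the key field.

    key_field="id" is an assumption; use whatever <uniqueKey> the schema declares.
    """
    params = {
        "q": "*:*",
        "fl": key_field,
        "rows": rows,
        "wt": "json",
        "distrib": "false",  # ask this core only, no fan-out to other shards
    }
    return shard_base.rstrip("/") + "/select?" + urlencode(params)


def find_cross_shard_duplicates(ids_by_shard):
    """Return {key: [shards]} for every key that appears on more than one shard."""
    shards_by_id = defaultdict(set)
    for shard, ids in ids_by_shard.items():
        for doc_id in ids:
            shards_by_id[doc_id].add(shard)
    return {
        doc_id: sorted(shards)
        for doc_id, shards in shards_by_id.items()
        if len(shards) > 1
    }


if __name__ == "__main__":
    # Made-up ids for illustration; real ones would come from each shard's response.
    sample = {"core0": ["a", "b", "c"], "core1": ["c", "d"]}
    print(shard_query_url("http://localhost:8983/solr/core0"))
    print(find_cross_shard_duplicates(sample))
```

If the result is non-empty, the same document was indexed to more than
one shard, which by itself would explain repeated entries in a
distributed result. An empty result would point toward an actual bug.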

> Solr returns duplicates when using distributed search with group.format=simple
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-5878
>                 URL: https://issues.apache.org/jira/browse/SOLR-5878
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.6
>            Reporter: J.B. Langston
>
> Solr returns duplicate documents when group.format=simple is supplied on a distributed search. This does not happen on the standard group format or when not using distributed search. 
> For example:
> {code}
> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*%3A*&fq=evt_stub%3A(452deed8-c3a2-49a8-878d-8356da315e6a)&start=0&rows=5&fl=cont_stub&wt=xml&indent=true&group=true&group.field=cont_stub&group.format=simple&group.limit=1000
> {code}
> Returns:
> {code}
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader">
>   <int name="status">0</int>
>   <int name="QTime">253</int>
> </lst>
> <lst name="grouped">
>   <lst name="cont_stub">
>     <int name="matches">56</int>
>     <result name="doclist" numFound="56" start="0" maxScore="1.0">
>       <doc>
>         <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
>       <doc>
>         <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
>       <doc>
>         <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
>       <doc>
>         <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
>       <doc>
>         <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
>       <doc>
>         <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
>       <doc>
>         <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
>       <doc>
>         <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
>       <doc>
>         <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
>       <doc>
>         <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
>       <doc>
>         <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
>       <doc>
>         <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
>       <doc>
>         <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
>       <doc>
>         <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
>       <doc>
>         <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
>       <doc>
>         <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
>     </result>
>   </lst>
> </lst>
> </response>
> {code}
> It should only return 5 documents.  Removing the distributed search and searching on either core will return the requested number of rows. Removing group.format=simple will also return the requested number of rows.


