Posted to solr-user@lucene.apache.org by nagarjuna <na...@gmail.com> on 2011/10/04 10:55:27 UTC
how to avoid duplicates in search results?
Hi everybody,
I got the following response:
<code>
<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="df">groups</str>
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">participate</str>
      <str name="version">2.2</str>
      <str name="rows">30</str>
    </lst>
  </lst>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="description">testing group</str>
      <str name="name">testing group</str>
      <str name="url">http://abc.xyz.com/groups/testing-group/discussions/62</str>
    </doc>
    <doc>
      <str name="description">testing group</str>
      <str name="name">testing group</str>
      <str name="url">http://abc.xyz.com/groups/testing-group/discussions/62</str>
    </doc>
  </result>
</response>
</code>
I need to remove the duplicate results.
Can anyone give me suggestions?
--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-avoid-duplicates-in-search-results-tp3392524p3392524.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: how to avoid duplicates in search results?
Posted by Chris Hostetter <ho...@fucit.org>.
: There is also a Document Duplicate Detection at index time:
: http://wiki.apache.org/solr/Deduplication
Or just setting "url" as your uniqueKey field would solve this simple
use case, but it's not entirely clear what else you consider "duplicates"
besides this one example.
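
As a rough sketch of the uniqueKey suggestion, a schema.xml fragment might look like the following. The field type and attributes shown for "url" are assumptions, not taken from the poster's schema, and the index would need to be rebuilt after the change.

```xml
<!-- schema.xml fragment (sketch). The type and attributes of "url" are assumptions. -->
<fields>
  <!-- A uniqueKey field has to be indexed and single-valued. -->
  <field name="url" type="string" indexed="true" stored="true"
         required="true" multiValued="false"/>
  <!-- other fields (name, description, and so on) stay as they are -->
</fields>

<!-- Adding a document whose url is already in the index replaces the older document. -->
<uniqueKey>url</uniqueKey>
```

With this in place, the two documents in the example could not both exist in the index, because they share the same url value.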
: > <doc>
: > <str name="description">testing group</str>
: > <str name="name">testing group</str>
: > <str name="url">http://abc.xyz.com/groups/testing-group/discussions/62</str>
: > </doc>
: > <doc>
: > <str name="description">testing group</str>
: > <str name="name">testing group</str>
: > <str name="url">http://abc.xyz.com/groups/testing-group/discussions/62</str>
: > </doc>
-Hoss
Re: how to avoid duplicates in search results?
Posted by Edoardo Tosca <e....@sourcesense.com>.
You can probably use the Grouping feature:
http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters
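
A minimal sketch of grouping on the "url" field, written as request handler defaults in solrconfig.xml. The handler name "/search-grouped" is hypothetical, and this assumes a Solr version that ships the grouping feature and a single-valued, indexed "url" field. The same three parameters can be passed on the query string instead.

```xml
<!-- solrconfig.xml fragment (sketch). The handler name is hypothetical. -->
<requestHandler name="/search-grouped" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- turn result grouping on -->
    <str name="group">true</str>
    <!-- documents with the same url fall into one group -->
    <str name="group.field">url</str>
    <!-- return the top document of each group as a flat result list -->
    <str name="group.main">true</str>
  </lst>
</requestHandler>
```

This collapses duplicates at query time only; both documents remain in the index.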
There is also a Document Duplicate Detection at index time:
http://wiki.apache.org/solr/Deduplication
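
A sketch of the index-time approach, modelled on the example from that wiki page. The chain name "dedupe", the "signature" field, and the choice of fields to hash are assumptions for this thread's documents. The schema also needs a matching single-valued string field named "signature".

```xml
<!-- solrconfig.xml fragment (sketch). Chain name, signature field, and hashed fields are assumptions. -->
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <!-- field that stores the computed signature -->
    <str name="signatureField">signature</str>
    <!-- when true, a new document replaces older ones with the same signature -->
    <bool name="overwriteDupes">true</bool>
    <!-- fields whose values are hashed into the signature -->
    <str name="fields">name,description,url</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

The chain still has to be selected by the update handler. The parameter used for that differs between Solr versions, so check the wiki page for the release in use.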
--
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com