Posted to java-user@lucene.apache.org by Ben Gill <be...@gmail.com> on 2005/09/17 11:03:39 UTC

Stopping Duplicates

Hi,

I am storing names in my index, and am currently getting duplicates
back (quite correctly, on Lucene's part), because I am storing:

id     name
1      fred
2      fred

What I want is that, if a duplicate name is added to the index, only
one entry with that name ever exists.

What is the best way for me to achieve this?

Thanks

Ben

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Stopping Duplicates

Posted by Jeff Rodenburg <je...@gmail.com>.
Ben -

I can think of two ways to achieve this.

1) While adding your information to the index, query the index for an 
existing record. If you get no match, add the record.
2) Control the exclusivity requirement from your data source, so that no 
duplicate records ever have the opportunity to be indexed.
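The first option can be sketched roughly as below. This is a minimal illustration, not Lucene code: UniqueNameIndex is a hypothetical in-memory stand-in for the index, and addIfAbsent is an invented name. It also assumes names match exactly (no case folding or trimming). In a real Lucene index the containsKey check would be a search on the name field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the index, keyed by name (assumption: exact match).
class UniqueNameIndex {
    private final Map<String, Integer> idByName = new LinkedHashMap<>();

    /** Adds the record only if no record with this name exists; returns true if added. */
    boolean addIfAbsent(int id, String name) {
        if (idByName.containsKey(name)) { // query for an existing record
            return false;                 // match found: skip the duplicate
        }
        idByName.put(name, id);           // no match: add the record
        return true;
    }

    /** Returns the id stored for a name, or null if the name is absent. */
    Integer idFor(String name) {
        return idByName.get(name);
    }

    int size() {
        return idByName.size();
    }

    public static void main(String[] args) {
        UniqueNameIndex index = new UniqueNameIndex();
        System.out.println(index.addIfAbsent(1, "fred")); // prints true
        System.out.println(index.addIfAbsent(2, "fred")); // prints false
        System.out.println(index.size());                 // prints 1
    }
}
```

With a real index, the lookup only sees documents that are visible to the reader doing the search, so a batch loader would probably also need to remember the names it has added since that reader was opened.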

This is an operational question, so the *best* way depends on your overall 
operation, as both of these approaches have consequences for index 
maintenance.
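For the second option, one possible shape is to collapse the rows by name before anything is handed to the indexer. The sketch below is only an assumption about what the data source looks like: Person and uniqueByName are hypothetical names, the first row seen for a name wins, and names are compared exactly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SourceDedup {
    // Hypothetical record type for a row coming from the data source.
    static final class Person {
        final int id;
        final String name;

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Keeps the first row for each name, preserving the input order. */
    static List<Person> uniqueByName(List<Person> rows) {
        Map<String, Person> firstByName = new LinkedHashMap<>();
        for (Person p : rows) {
            firstByName.putIfAbsent(p.name, p); // later duplicates are ignored
        }
        return new ArrayList<>(firstByName.values());
    }

    public static void main(String[] args) {
        List<Person> rows = new ArrayList<>();
        rows.add(new Person(1, "fred"));
        rows.add(new Person(2, "fred"));
        for (Person p : uniqueByName(rows)) {
            System.out.println(p.id + " " + p.name); // prints "1 fred"
        }
    }
}
```

If the data lives in a database, a unique constraint on the name column would be another way to get the same guarantee before indexing.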

Hope this helps.

-- jeff

