Posted to java-user@lucene.apache.org by Darren Hartford <dh...@ghsinc.com> on 2007/07/02 14:07:42 UTC
RE: Geneology, nicknames, levenstein, soundex/metaphone, etc
Thank you for the link to the previous thread; there is a lot of
information there!
*Synonym use of nicknames - that sounds quite feasible. Do you
specifically mean the WordNet module in the Sandbox, or something
different?
> -----Original Message-----
> From: Grant Ingersoll [mailto:gsingers@apache.org]
> Sent: Friday, June 29, 2007 12:30 PM
> To: java-user@lucene.apache.org
> Subject: Re: Geneology, nicknames, levenstein, soundex/metaphone, etc
>
> You may find this thread useful:
> http://www.gossamer-threads.com/lists/lucene/java-user/47824?search_string=record%20linkage;#47824
> although it doesn't answer all your ?'s
>
> > *nickname: would it be feasible to create an Analyzer that will tie
> > to an external/internal nickname datasource (datasource would vary
> > dramatically based on nationality). Use case: Jon, John, Johnny,
> > Jonathan would have 'weight' in the relevance. Similarly 'Dick',
> > 'Chuck', and 'Charles'.
>
> Maybe you could inject these as synonyms?
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Geneology, nicknames, levenstein, soundex/metaphone, etc
Posted by Grant Ingersoll <gs...@apache.org>.
On Jul 2, 2007, at 8:07 AM, Darren Hartford wrote:
> Thank you for the link to the previous thread, lot of information
> there!
>
> *Synonym use of nicknames - that sounds quite feasible. Do you
> specifically mean the WordNet module in the Sandbox, or something
> different?
No, I was thinking along the lines of the SynonymAnalyzer in
Lucene in Action, where you add the nicknames as tokens at the same
position as the original, so that searches on the nicknames
still match. I don't know whether it solves your need for "weight" in the
relevance, but it might.
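
To make the "same position" idea concrete, here is a rough sketch in plain
Java. It does not use the Lucene API or the book's SynonymAnalyzer; the
Token class, the analyze method, and the nickname table are all
hypothetical stand-ins. A position increment of 1 means "next position" and
0 means "stacked on the previous token", which is the convention Lucene
uses for injected synonyms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class NicknameInjection {

    /** Hypothetical token: a term plus its position increment. */
    static final class Token {
        final String term;
        final int positionIncrement; // 1 = next position, 0 = same position as previous token

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    // Illustrative nickname table (an assumption); a real one would come from
    // an external datasource and vary by nationality.
    static final Map<String, List<String>> NICKNAMES = Map.of(
            "john", List.of("jon", "johnny", "jonathan"),
            "charles", List.of("chuck"));

    /** Lowercases whitespace-separated words and stacks nicknames on each original term. */
    static List<Token> analyze(String text) {
        List<Token> out = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank input yields a single empty string
            }
            String term = word.toLowerCase(Locale.ROOT);
            out.add(new Token(term, 1)); // the original term advances the position
            for (String nick : NICKNAMES.getOrDefault(term, List.of())) {
                out.add(new Token(nick, 0)); // nickname shares the original's position
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("John Smith"));
    }
}
```

Because the nicknames occupy the same position as "john", a phrase query
such as "johnny smith" would still line up with the indexed text. Injecting
at index time in this way is one-directional: a document containing only
"Johnny" is not expanded unless the table also maps that form. Nothing in
this sketch ranks an exact match above a nickname match, so the "weight"
question remains open.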
--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org/tech/lucene.asp
Read the Lucene Java FAQ at http://wiki.apache.org/lucene-java/LuceneFAQ