Posted to java-user@lucene.apache.org by Helmut Jarausch <ja...@igpm.rwth-aachen.de> on 2007/12/17 09:41:58 UTC

FuzzyQuery - rounding bug?

Hi,

According to the LiA (Lucene in Action) book, the FuzzyQuery similarity is computed as

1 - distance / min(textlen, targetlen)

Given

# writer: an IndexWriter opened beforehand (setup not shown)
def addDoc(text, writer):
    # index `text` as a single stored, tokenized field named "field"
    doc = Document()
    doc.add(Field("field", text,
                  Field.Store.YES, Field.Index.TOKENIZED))
    writer.addDocument(doc)

addDoc("aaaaa", writer)
addDoc("aaaab", writer)
addDoc("aaabb", writer)
addDoc("aabbb", writer)
addDoc("abbbb", writer)
addDoc("bbbbb", writer)
addDoc("ddddd", writer)

query = FuzzyQuery(Term("field", "aaaaa"), 0.8, 0)

should find "aaaab", since we have
distance = 1
min(textlen,targetlen) = 5
and therefore 1 - 1/5 = 0.8. But it does not find it.

It does find it with
query = FuzzyQuery(Term("field", "aaaaa"),0.79,0)
though.

Is there a rounding error bug?

(this is with lucene-java-2.2.0-603782)
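
The arithmetic above can be checked outside Lucene with a small pure-Python sketch. It is only an illustration of the formula quoted from the book, not Lucene's actual code: the helper names (levenshtein, similarity, matches) are made up for this example, and the strict flag is an assumption. It models one possible explanation, namely that the threshold test is "similarity > minimum" rather than ">=", which would have to be verified against the Lucene source.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]


def similarity(text, target):
    """Formula quoted in the post: 1 - distance / min(textlen, targetlen).

    Both strings must be non-empty, otherwise the division fails.
    """
    return 1.0 - levenshtein(text, target) / min(len(text), len(target))


def matches(text, target, min_similarity, strict=True):
    # ASSUMPTION: strict=True models a "similarity > threshold" test;
    # strict=False models ">=". Which one Lucene 2.2 uses is not
    # established in this thread.
    score = similarity(text, target)
    return score > min_similarity if strict else score >= min_similarity


if __name__ == "__main__":
    for threshold in (0.8, 0.79):
        print(threshold,
              matches("aaaaa", "aaaab", threshold, strict=True),
              matches("aaaaa", "aaaab", threshold, strict=False))
```

Under a strict comparison, a term whose similarity is exactly equal to the threshold is rejected, which would reproduce the behaviour described here without any rounding error being involved.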

Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: FuzzyQuery - rounding bug?

Posted by Erick Erickson <er...@gmail.com>.
Please do not hijack the thread. When starting a new topic, do NOT
use "reply to"; start an entirely new e-mail instead. Otherwise your topic
often gets ignored by people who are uninterested in the original thread.

Best
Erick

On Dec 17, 2007 5:57 AM, anjana m <an...@gmail.com> wrote:

> How do I use Lucene search to search files on the local system?

Re: FuzzyQuery - rounding bug?

Posted by anjana m <an...@gmail.com>.
How do I use Lucene search to search files on the local system?
