Posted to user@lucenenet.apache.org by Eran Sevi <er...@gmail.com> on 2009/08/26 15:51:28 UTC

SpanAndQuery new implementation

Hello all,

Currently, the only way to simulate a SpanAndQuery is to use a SpanNearQuery
with a large enough slop.
The problem with this approach is that the returned span covers all the
positions between the AND clauses. This makes it ineffective for
highlighting and for determining which of the original query terms actually
matched. There is also the issue of order, which shouldn't matter for AND
queries.

I've created a new SpanAndQuery implementation and would like to contribute
it.
My solution returns each matching hit as its own separate span.

For example:

doc1: a b c d
doc2: a d b c a

A span near query for (a NEAR c) will result in two spans: doc1:"a b c" and
doc2:"a d b c".
With the new span AND query, the result will be: doc1: "a" "c", doc2: "a" "c"
"a" (with the correct positions of each span).

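To make the difference concrete, here is a minimal, dependency-free sketch of the two behaviours on the example documents above. It is not the attached implementation and does not use any Lucene or Lucene.Net API. The names `span_and` and `enclosing_span` are hypothetical, positions are zero-based token offsets with an exclusive end, and `enclosing_span` is only a simplified stand-in for an ordered near match.

```python
from typing import List, Optional, Tuple

# (start, end, term): token positions, end is exclusive.
Span = Tuple[int, int, str]


def span_and(tokens: List[str], terms: List[str]) -> List[Span]:
    """Hypothetical AND semantics: if every query term occurs in the
    document, return one single-token span per occurrence of any term."""
    wanted = set(terms)
    if not wanted.issubset(tokens):
        return []  # at least one clause is missing, so no match at all
    return [(i, i + 1, tok) for i, tok in enumerate(tokens) if tok in wanted]


def enclosing_span(tokens: List[str], first: str, second: str) -> Optional[Tuple[int, int]]:
    """Simplified stand-in for a near match (an assumption, not Lucene's
    algorithm): one span from the first occurrence of `first` up to the
    next occurrence of `second`, covering everything in between."""
    try:
        start = tokens.index(first)
        end = tokens.index(second, start + 1)
    except ValueError:
        return None
    return (start, end + 1)


docs = {"doc1": "a b c d".split(), "doc2": "a d b c a".split()}

for name, tokens in docs.items():
    near = enclosing_span(tokens, "a", "c")
    covered = " ".join(tokens[near[0]:near[1]]) if near else None
    print(name, "near:", covered, "| and:", span_and(tokens, ["a", "c"]))
```

The near-style result is a single span that includes the tokens between the clauses, while the AND-style result keeps one span per matched term, which is what a highlighter needs.
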
How should I contribute it? (I tried sending this email to the dev mailing
list but got a weird response from Minimalist Manager.)

I know that Lucene.Net only follows the Java Lucene project, but if someone
is interested in my implementation, they can port it to Java and add it to
the Java version (although changes will probably be required because of the
version differences between Java and .NET).

Thanks,
Eran.

Re: SpanAndQuery new implementation

Posted by Eran Sevi <er...@gmail.com>.

Attached is my implementation of SpanAndQuery.
The code is supplied as is, which means it's not thoroughly tested (and
probably not the most efficient).
It also contains other changes, such as weight changes and removal of the
same-field limitation on clauses, which are not related to the SpanAndQuery
implementation itself.

If you find any problems with the code, please post your remarks/fixes so
everyone can benefit from them as well.

Thanks,
Eran.
