Posted to dev@lucene.apache.org by Erik Hatcher <er...@ehatchersolutions.com> on 2006/01/25 12:14:41 UTC
NearSpans issue
In using SpanQuery in a sophisticated way, we've experienced an issue
with NearSpans giving the following exception:
java.lang.RuntimeException: Unexpected: ordered
    at org.apache.lucene.search.spans.NearSpans.firstNonOrderedNextToPartialList(NearSpans.java:291)
    at org.apache.lucene.search.spans.NearSpans.next(NearSpans.java:183)
    at org.apache.lucene.search.spans.SpanScorer.next(SpanScorer.java:50)
    at org.apache.lucene.search.BooleanScorer$SubScorer.<init>(BooleanScorer.java:48)
    at org.apache.lucene.search.BooleanScorer.add(BooleanScorer.java:76)
    at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:274)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:98)
    at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
    at org.apache.lucene.search.Hits.<init>(Hits.java:44)
    at org.apache.lucene.search.Searcher.search(Searcher.java:44)
    at org.apache.lucene.search.Searcher.search(Searcher.java:36)
This is using the current trunk code. I believe this may be related
to http://issues.apache.org/jira/browse/LUCENE-413
Unfortunately I cannot share the exact index or query, but the query
ends up being a BooleanQuery with about 10 clauses, some being simple
TermQuery's, and others being SpanNearQuery's nested with
SpanRegexQuery's and SpanTermQuery's.
Thoughts? Is this related to LUCENE-413? Shall I re-apply all those
patches (if they still work) and give it another try?
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
Re: NearSpans issue
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
We're working on duplicating this issue with a general and small
index. The index was built with a trunk version of Lucene.
I'll re-patch things soon and see how that goes.
Erik
Re: NearSpans issue
Posted by Paul Elschot <pa...@xs4all.nl>.
On Wednesday 25 January 2006 12:14, Erik Hatcher wrote:
> In using SpanQuery in a sophisticated way, we've experienced an issue
> with NearSpans giving the following exception:
>
> java.lang.RuntimeException: Unexpected: ordered
>     at org.apache.lucene.search.spans.NearSpans.firstNonOrderedNextToPartialList(NearSpans.java:291)
In all likelihood this code is buggy, and I would prefer to use the
NearSpansOrdered code from LUCENE-413, as you indicated.
>     at org.apache.lucene.search.spans.NearSpans.next(NearSpans.java:183)
>     at org.apache.lucene.search.spans.SpanScorer.next(SpanScorer.java:50)
>     at org.apache.lucene.search.BooleanScorer$SubScorer.<init>(BooleanScorer.java:48)
>     at org.apache.lucene.search.BooleanScorer.add(BooleanScorer.java:76)
>     at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:274)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:98)
>     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
>     at org.apache.lucene.search.Hits.<init>(Hits.java:44)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:44)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:36)
>
> This is using the current trunk code. I believe this may be related
> to http://issues.apache.org/jira/browse/LUCENE-413
Was the trunk code also used to create the index? I (very) vaguely recall
seeing another message hinting (to me at least) that this might be related
to building the index with an older version. I'll also have a look through
my archives for this.
>
> Unfortunately I cannot share the exact index or query, but the query
> ends up being a BooleanQuery with about 10 clauses, some being simple
> TermQuery's, and others being SpanNearQuery's nested with
> SpanRegexQuery's and SpanTermQuery's.
>
> Thoughts? Is this related to LUCENE-413? Shall I re-apply all those
> patches (if they still work) and give it another try?
I would certainly like to get the NearSpans code correct, and I could
continue bug hunting as before. I would prefer to start from the
NearSpansOrdered/Unordered code as posted there because that
will at least get the "Unexpected: ordered" exception out of the way.
Regards,
Paul Elschot
Re: NearSpans issue
Posted by David Balmain <db...@gmail.com>.
Hi Erik,
The only way I can see this exception being thrown is when you have
two SpansCells with the same start in a particular document. In this
case matchIsOrdered will return false even though the SpansCells may
still be ordered in the priority queue. The current code for
matchIsOrdered is:
  private boolean matchIsOrdered() {
    int lastStart = -1;
    for (int i = 0; i < ordered.size(); i++) {
      int start = ((SpansCell)ordered.get(i)).start();
      if (!(start > lastStart))
        return false;
      lastStart = start;
    }
    return true;
  }
I think maybe:
    if (!(start > lastStart))
should be:
    if (start < lastStart)
I'm afraid I don't have time to test this but hopefully it'll help.
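To illustrate the difference between the two comparisons above, here is a
minimal standalone sketch. It is not taken from NearSpans: the class name
OrderCheckSketch, both method names, and the plain int[] of start positions
are assumptions made only for this example. It shows how the two checks
treat a list in which two cells share the same start, and says nothing
about whether the proposed change is the right fix for NearSpans itself.

```java
// Standalone sketch, not Lucene code. All names here are hypothetical.
final class OrderCheckSketch {

    // Mirrors the check quoted above: each start must be strictly greater
    // than the previous one, so two equal starts count as "not ordered".
    static boolean strictlyIncreasing(int[] starts) {
        int lastStart = -1;
        for (int i = 0; i < starts.length; i++) {
            int start = starts[i];
            if (!(start > lastStart)) {
                return false;
            }
            lastStart = start;
        }
        return true;
    }

    // Mirrors the proposed change: only a start smaller than the previous
    // one counts as "not ordered", so equal starts are accepted.
    static boolean nonDecreasing(int[] starts) {
        int lastStart = -1;
        for (int i = 0; i < starts.length; i++) {
            int start = starts[i];
            if (start < lastStart) {
                return false;
            }
            lastStart = start;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] tiedStarts = {3, 3, 7}; // two cells share start position 3
        System.out.println("strict check:     " + strictlyIncreasing(tiedStarts));
        System.out.println("non-strict check: " + nonDecreasing(tiedStarts));
    }
}
```

With tied starts the strict check reports the list as not ordered while the
non-strict check accepts it. For lists with no ties the two checks agree.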
Cheers,
Dave
Re: NearSpans issue
Posted by Paul Elschot <pa...@xs4all.nl>.
On Friday 27 January 2006 16:33, Erik Hatcher wrote:
> One of the engineers decided to remove that exception to see what
> effect it had on our particular troubling query:
>
> Index: src/java/org/apache/lucene/search/spans/NearSpans.java
> ===================================================================
> --- src/java/org/apache/lucene/search/spans/NearSpans.java (revision 372606)
> +++ src/java/org/apache/lucene/search/spans/NearSpans.java (working copy)
> @@ -288,7 +288,8 @@
>          // When queue is empty and checkSlop() and ordered there is a match.
>        }
>      }
> -    throw new RuntimeException("Unexpected: ordered");
> +    //throw new RuntimeException("Unexpected: ordered");
> +    return false;
>    }
>
>    private void listToQueue() {
>
>
> And the error with that query went away and the results were accurate.
A pragmatic approach.
> I realize this isn't the long-term fix, but wanted to report where
> things stand.
At the time this code fixed another bug in the span logic.
I vaguely recall that I was not certain what should be done
when the execution reaches that particular point, and I put the exception
in to make sure that something would be done about it when needed.
So it is actually unfinished code, and in future I'll try harder not to
code that way.
>
> I'm still going to patch things up locally with those JIRA patches
> and see where that takes it. We've not had success building a
> generic index that we can share that duplicates this issue,
> unfortunately.
I hope those patches solve the problem.
Regards,
Paul Elschot
Re: NearSpans issue
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
One of the engineers decided to remove that exception to see what
effect it had on our particular troubling query:
Index: src/java/org/apache/lucene/search/spans/NearSpans.java
===================================================================
--- src/java/org/apache/lucene/search/spans/NearSpans.java (revision 372606)
+++ src/java/org/apache/lucene/search/spans/NearSpans.java (working copy)
@@ -288,7 +288,8 @@
         // When queue is empty and checkSlop() and ordered there is a match.
       }
     }
-    throw new RuntimeException("Unexpected: ordered");
+    //throw new RuntimeException("Unexpected: ordered");
+    return false;
   }

   private void listToQueue() {
And the error with that query went away and the results were accurate.
I realize this isn't the long-term fix, but wanted to report where
things stand.
I'm still going to patch things up locally with those JIRA patches
and see where that takes it. We've not had success building a
generic index that we can share that duplicates this issue,
unfortunately.
Erik