Posted to dev@lucene.apache.org by Erik Hatcher <er...@ehatchersolutions.com> on 2006/01/25 12:14:41 UTC

NearSpans issue

In using SpanQuery in a sophisticated way, we've experienced an issue  
with NearSpans giving the following exception:

java.lang.RuntimeException: Unexpected: ordered
        at org.apache.lucene.search.spans.NearSpans.firstNonOrderedNextToPartialList(NearSpans.java:291)
        at org.apache.lucene.search.spans.NearSpans.next(NearSpans.java:183)
        at org.apache.lucene.search.spans.SpanScorer.next(SpanScorer.java:50)
        at org.apache.lucene.search.BooleanScorer$SubScorer.<init>(BooleanScorer.java:48)
        at org.apache.lucene.search.BooleanScorer.add(BooleanScorer.java:76)
        at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:274)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:98)
        at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
        at org.apache.lucene.search.Hits.<init>(Hits.java:44)
        at org.apache.lucene.search.Searcher.search(Searcher.java:44)
        at org.apache.lucene.search.Searcher.search(Searcher.java:36)

This is using the current trunk code. I believe this may be related
to http://issues.apache.org/jira/browse/LUCENE-413

Unfortunately I cannot share the exact index or query, but the query
ends up being a BooleanQuery with about 10 clauses, some being simple
TermQuery clauses, and others being SpanNearQuery clauses nested with
SpanRegexQuery and SpanTermQuery clauses.

Thoughts?  Is this related to LUCENE-413?  Shall I re-apply all those  
patches (if they still work) and give it another try?

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: NearSpans issue

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
We're working on reproducing this issue with a small, generic
index.  The index was built with a trunk version of Lucene.

I'll re-patch things soon and see how that goes.

	Erik



On Jan 25, 2006, at 12:08 PM, Paul Elschot wrote:
> On Wednesday 25 January 2006 12:14, Erik Hatcher wrote:
>> In using SpanQuery in a sophisticated way, we've experienced an issue
>> with NearSpans giving the following exception:
>>
>> java.lang.RuntimeException: Unexpected: ordered
>>         at org.apache.lucene.search.spans.NearSpans.firstNonOrderedNextToPartialList(NearSpans.java:291)
>
> In all likelihood this code is buggy, and I would prefer to use the
> NearSpansOrdered code from LUCENE-413, as you indicated.
>
>>         at org.apache.lucene.search.spans.NearSpans.next(NearSpans.java:183)
>>         at org.apache.lucene.search.spans.SpanScorer.next(SpanScorer.java:50)
>>         at org.apache.lucene.search.BooleanScorer$SubScorer.<init>(BooleanScorer.java:48)
>>         at org.apache.lucene.search.BooleanScorer.add(BooleanScorer.java:76)
>>         at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:274)
>>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:98)
>>         at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
>>         at org.apache.lucene.search.Hits.<init>(Hits.java:44)
>>         at org.apache.lucene.search.Searcher.search(Searcher.java:44)
>>         at org.apache.lucene.search.Searcher.search(Searcher.java:36)
>>
>> This is using the current trunk code.   I believe this may be related
>> to http://issues.apache.org/jira/browse/LUCENE-413
>
> Was the trunk code also used to create the index? I (very) vaguely recall
> seeing another message hinting (to me at least) that this might be related
> to building the index with an older version. I'll also have a look through
> my archives for this.
>
>>
>> Unfortunately I cannot share the exact index or query, but the query
>> ends up being a BooleanQuery with about 10 clauses, some being simple
>> TermQuery's, and others being SpanNearQuery's nested with
>> SpanRegexQuery's and SpanTermQuery's.
>>
>> Thoughts?  Is this related to LUCENE-413?  Shall I re-apply all those
>> patches (if they still work) and give it another try?
>
> I would certainly like to get the NearSpans code correct, and I could
> continue bug hunting as earlier. I would prefer to start from the
> NearSpansOrdered/Unordered code as posted there because that
> will at least get the "Unexpected: ordered" exception out of the way.
>
> Regards,
> Paul Elschot


Re: NearSpans issue

Posted by Paul Elschot <pa...@xs4all.nl>.
On Wednesday 25 January 2006 12:14, Erik Hatcher wrote:
> In using SpanQuery in a sophisticated way, we've experienced an issue  
> with NearSpans giving the following exception:
> 
> java.lang.RuntimeException: Unexpected: ordered
>         at org.apache.lucene.search.spans.NearSpans.firstNonOrderedNextToPartialList(NearSpans.java:291)

In all likelihood this code is buggy, and I would prefer to use the
NearSpansOrdered code from LUCENE-413, as you indicated.

>         at org.apache.lucene.search.spans.NearSpans.next(NearSpans.java:183)
>         at org.apache.lucene.search.spans.SpanScorer.next(SpanScorer.java:50)
>         at org.apache.lucene.search.BooleanScorer$SubScorer.<init>(BooleanScorer.java:48)
>         at org.apache.lucene.search.BooleanScorer.add(BooleanScorer.java:76)
>         at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:274)
>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:98)
>         at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
>         at org.apache.lucene.search.Hits.<init>(Hits.java:44)
>         at org.apache.lucene.search.Searcher.search(Searcher.java:44)
>         at org.apache.lucene.search.Searcher.search(Searcher.java:36)
> 
> This is using the current trunk code.   I believe this may be related  
> to http://issues.apache.org/jira/browse/LUCENE-413

Was the trunk code also used to create the index? I (very) vaguely recall
seeing another message hinting (to me at least) that this might be related
to building the index with an older version. I'll also have a look through
my archives for this.

> 
> Unfortunately I cannot share the exact index or query, but the query  
> ends up being a BooleanQuery with about 10 clauses, some being simple  
> TermQuery's, and others being SpanNearQuery's nested with  
> SpanRegexQuery's and SpanTermQuery's.
> 
> Thoughts?  Is this related to LUCENE-413?  Shall I re-apply all those  
> patches (if they still work) and give it another try?

I would certainly like to get the NearSpans code correct, and I could
continue bug hunting as earlier. I would prefer to start from the
NearSpansOrdered/Unordered code as posted there because that
will at least get the "Unexpected: ordered" exception out of the way.

Regards,
Paul Elschot



Re: NearSpans issue

Posted by David Balmain <db...@gmail.com>.
Hi Erik,

The only way I can see this exception being thrown is when you have
two SpansCells with the same start in a particular document. In this
case matchIsOrdered will return false even though the SpansCells may
still be ordered in the priority queue. The current code for
matchIsOrdered is:

  private boolean matchIsOrdered() {
    int lastStart = -1;
    for (int i = 0; i < ordered.size(); i++) {
      int start = ((SpansCell)ordered.get(i)).start();
      if (!(start > lastStart))
        return false;
      lastStart = start;
    }
    return true;
  }

I think maybe:

      if (!(start > lastStart))

should be:

      if (start < lastStart)

I'm afraid I don't have time to test this but hopefully it'll help.
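
To make the difference between the two comparisons concrete, here is a
minimal stand-alone sketch. It is not the NearSpans code: the class and
method names (StartOrderCheck, strictlyOrdered, orderedAllowingTies) are
hypothetical, and the start positions are passed in as a plain int array
instead of being read from SpansCell objects. It only shows how each test
treats two equal starts; it says nothing about whether the change is the
right fix inside Lucene.

```java
// Hypothetical illustration only; not taken from NearSpans.java.
final class StartOrderCheck {

  // Same comparison as the current matchIsOrdered:
  // each start must be strictly greater than the previous one.
  static boolean strictlyOrdered(int[] starts) {
    int lastStart = -1;
    for (int start : starts) {
      if (!(start > lastStart)) {
        return false; // a repeated start is rejected here
      }
      lastStart = start;
    }
    return true;
  }

  // Suggested comparison: only a start that goes backwards is rejected,
  // so two cells sharing a start still count as ordered.
  static boolean orderedAllowingTies(int[] starts) {
    int lastStart = -1;
    for (int start : starts) {
      if (start < lastStart) {
        return false;
      }
      lastStart = start;
    }
    return true;
  }

  public static void main(String[] args) {
    int[] tied = {3, 3, 7}; // two cells with the same start
    System.out.println("strict:     " + strictlyOrdered(tied));
    System.out.println("allow ties: " + orderedAllowingTies(tied));
  }
}
```

With the tied input the strict test reports "not ordered" while the
relaxed test reports "ordered", which is the mismatch described above.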

Cheers,
Dave



Re: NearSpans issue

Posted by Paul Elschot <pa...@xs4all.nl>.
On Friday 27 January 2006 16:33, Erik Hatcher wrote:
> One of the engineers decided to remove that exception to see what  
> effect it had on our particular troubling query:
> 
> Index: src/java/org/apache/lucene/search/spans/NearSpans.java
> ===================================================================
> --- src/java/org/apache/lucene/search/spans/NearSpans.java      (revision 372606)
> +++ src/java/org/apache/lucene/search/spans/NearSpans.java      (working copy)
> @@ -288,7 +288,8 @@
>          // When queue is empty and checkSlop() and ordered there is a match.
>        }
>      }
> -    throw new RuntimeException("Unexpected: ordered");
> +    //throw new RuntimeException("Unexpected: ordered");
> +    return false;
>    }
> 
>    private void listToQueue() {
> 
> 
> And the error with that query went away and the results were accurate.

A pragmatic approach.
 
> I realize this isn't the long-term fix, but wanted to report where  
> things stand.

At the time, this code fixed another bug in the span logic.
I vaguely recall that I was not certain what should be done
when execution reaches that particular point, so I put the exception
in to make sure that something would be done about it when needed.
So it is actually unfinished code, and next time I'll try harder to
avoid coding that way.

> 
> I'm still going to patch things up locally with those JIRA patches  
> and see where that takes it.  We've not had success building a  
> generic index that we can share that duplicates this issue,  
> unfortunately.

I hope those patches solve the problem.

Regards,
Paul Elschot



Re: NearSpans issue

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
One of the engineers decided to remove that exception to see what
effect it had on our particularly troublesome query:

Index: src/java/org/apache/lucene/search/spans/NearSpans.java
===================================================================
--- src/java/org/apache/lucene/search/spans/NearSpans.java      (revision 372606)
+++ src/java/org/apache/lucene/search/spans/NearSpans.java      (working copy)
@@ -288,7 +288,8 @@
         // When queue is empty and checkSlop() and ordered there is a match.
       }
     }
-    throw new RuntimeException("Unexpected: ordered");
+    //throw new RuntimeException("Unexpected: ordered");
+    return false;
   }

   private void listToQueue() {


And the error with that query went away and the results were accurate.

I realize this isn't the long-term fix, but wanted to report where  
things stand.

I'm still going to patch things up locally with those JIRA patches  
and see where that takes it.  We've not had success building a  
generic index that we can share that duplicates this issue,  
unfortunately.

	Erik




