Posted to solr-user@lucene.apache.org by jmr <jm...@free.fr> on 2010/10/21 11:53:39 UTC
A bug in ComplexPhraseQuery ?
Hi,
We have installed ComplexPhraseQuery, and since then we have seen strange
behaviour in proximity search.
We have the 2 following queries:
(text:("protein digest"~50))
(text:("digest protein"~50))
Without ComplexPhraseQuery, both queries return 6 matching documents.
With ComplexPhraseQuery, query 1 returns 4 documents and query 2 returns 5
documents!
It seems that proximity search is broken. Is this a known problem?
Thanks for your help.
Regards,
J-Michel
--
View this message in context: http://lucene.472066.n3.nabble.com/A-bug-in-ComplexPhraseQuery-tp1744659p1744659.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: A bug in ComplexPhraseQuery ?
Posted by jmr <jm...@free.fr>.
iorixxx wrote:
>
>> <queryParser name="complexphrase"
>> class="org.apache.solr.search.ComplexPhraseQParserPlugin">
>> <bool name="inOrder">false</bool>
>> </queryParser>
>>
>
> I added this change to SOLR-1604. Can you test it and give us feedback?
>
>
Many thanks. I'll test this quite soon and let you know.
J-Michel
--
View this message in context: http://lucene.472066.n3.nabble.com/A-bug-in-ComplexPhraseQuery-tp1744659p1757145.html
Re: A bug in ComplexPhraseQuery ?
Posted by jmr <jm...@free.fr>.
iorixxx wrote:
>
>
> I added Terje Eggestad's fix[1]. Can you test it and give us feedback?
>
>
Hi,
Sorry for the delay. The fix works well, but we discovered another
query that crashes the parser:
a63b27/00:IC
"org.apache.lucene.search.PhraseQuery" found in phrase query string
"a63b27/00"
at org.apache.lucene.queryParser.ComplexPhraseQueryParser$ComplexPhraseQuery.rewrite(ComplexPhraseQueryParser.java:256)
at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:311)
at org.apache.lucene.search.Query.weight(Query.java:98)
at org.apache.lucene.search.Searcher.createWeight(Searcher.java:230)
at org.apache.lucene.search.Searcher.search(Searcher.java:171)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
The problem is with the / character.
We made a fix that seems to work. Here is the diff:
--- ComplexPhraseQueryParser.java.org 2010-11-04 02:56:04.000000000 +0100
+++ ComplexPhraseQueryParser.java 2010-11-05 10:14:08.062500000 +0100
@@ -245,7 +245,7 @@
   public Query rewrite(IndexReader reader) throws IOException {
     // ArrayList spanClauses = new ArrayList();
-    if (contents instanceof TermQuery) {
+    if (contents instanceof TermQuery || contents instanceof PhraseQuery) {
       return contents;
     }
     // Build a sequence of Span clauses arranged in a SpanNear - child
Jean-Michel
--
View this message in context: http://lucene.472066.n3.nabble.com/A-bug-in-ComplexPhraseQuery-tp1744659p2057933.html
Re: A bug in ComplexPhraseQuery ?
Posted by Ahmet Arslan <io...@yahoo.com>.
> However, we have found that this query crashes when using
> ComplexPhraseQuery:
> "sulfur-reducing bacteria"
>
> It is due to the dash inside the phrase.
> Here is the trace:
> java.lang.IllegalArgumentException: Unknown query type
> "org.apache.lucene.search.PhraseQuery" found in phrase
> query string
> "sulfur-reducing bacteria"
I added Terje Eggestad's fix[1]. Can you test it and give us feedback?
[1]https://issues.apache.org/jira/browse/LUCENE-1486?focusedCommentId=12900278&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12900278
Re: A bug in ComplexPhraseQuery ?
Posted by jmr <jm...@free.fr>.
iorixxx wrote:
>
>
> I added this change to SOLR-1604. Can you test it and give us feedback?
>
>
Hi,
Sorry for the delay.
We have tested the change and it is OK for this.
However, we have found that this query crashes when using
ComplexPhraseQuery:
"sulfur-reducing bacteria"
It is due to the dash inside the phrase.
Here is the trace:
java.lang.IllegalArgumentException: Unknown query type
"org.apache.lucene.search.PhraseQuery" found in phrase query string
"sulfur-reducing bacteria"
at org.apache.lucene.queryParser.ComplexPhraseQueryParser$ComplexPhraseQuery.rewrite(ComplexPhraseQueryParser.java:290)
at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:438)
at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:311)
at org.apache.lucene.search.Query.weight(Query.java:98)
at org.apache.lucene.search.Searcher.createWeight(Searcher.java:230)
at org.apache.lucene.search.Searcher.search(Searcher.java:171)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
...
Regards
Jean-Michel
--
View this message in context: http://lucene.472066.n3.nabble.com/A-bug-in-ComplexPhraseQuery-tp1744659p1835918.html
Re: A bug in ComplexPhraseQuery ?
Posted by Ahmet Arslan <io...@yahoo.com>.
> <queryParser name="complexphrase"
> class="org.apache.solr.search.ComplexPhraseQParserPlugin">
> <bool name="inOrder">false</bool>
> </queryParser>
>
I added this change to SOLR-1604. Can you test it and give us feedback?
Re: A bug in ComplexPhraseQuery ?
Posted by Ahmet Arslan <io...@yahoo.com>.
> In my opinion, ordering terms in a proximity search does not
> make sense!
> So the workaround for us is to generate the reversed
> query every time a proximity operator is used.
> Not very elegant!
If you want, I can make it configurable. You could then set your choice in solrconfig.xml like this:
<queryParser name="complexphrase" class="org.apache.solr.search.ComplexPhraseQParserPlugin">
<bool name="inOrder">false</bool>
</queryParser>
Re: A bug in ComplexPhraseQuery ?
Posted by jmr <jm...@free.fr>.
iorixxx wrote:
>
> ComplexPhraseQuery is an ordered phrase query, whereas Lucene's default
> PhraseQuery is unordered. With ComplexPhrase, the order of terms is important.
>
Thanks for your answer.
With this request: (text:("protein digest"~50)) || (text:("digest
protein"~50))
I get my 6 documents.
In my opinion, ordering terms in a proximity search does not make sense!
So the workaround for us is to generate the reversed query every time a
proximity operator is used.
Not very elegant!
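For illustration, the workaround described above could be sketched roughly as follows. This is only a minimal sketch in plain Java string handling: the class ProximityExpander and the method expandBothOrders are hypothetical names, not part of Solr or Lucene, and the sketch assumes a simple two-term proximity phrase with no escaping or nested operators.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Solr/Lucene): builds the "both orders"
// query string for a two-term proximity phrase such as "protein digest"~50.
final class ProximityExpander {
    // Assumption: exactly two whitespace-separated terms, then ~slop.
    private static final Pattern TWO_TERM_PROXIMITY =
        Pattern.compile("\"([^\\s\"]+)\\s+([^\\s\"]+)\"~(\\d+)");

    static String expandBothOrders(String field, String phrase) {
        Matcher m = TWO_TERM_PROXIMITY.matcher(phrase.trim());
        if (!m.matches()) {
            // Anything else is passed through unchanged.
            return field + ":(" + phrase + ")";
        }
        String a = m.group(1);
        String b = m.group(2);
        String slop = m.group(3);
        // OR the original order with the reversed order.
        return "(" + field + ":(\"" + a + " " + b + "\"~" + slop + "))"
            + " || "
            + "(" + field + ":(\"" + b + " " + a + "\"~" + slop + "))";
    }

    public static void main(String[] args) {
        System.out.println(expandBothOrders("text", "\"protein digest\"~50"));
    }
}
```

For the field text and the phrase "protein digest"~50 this produces the same combined request as the one shown above. Phrases with more than two terms are deliberately left alone, since reversing them is not the same as making them unordered.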
Anyway, thanks again for the answer,
J-Michel
--
View this message in context: http://lucene.472066.n3.nabble.com/A-bug-in-ComplexPhraseQuery-tp1744659p1750748.html
Re: A bug in ComplexPhraseQuery ?
Posted by Ahmet Arslan <io...@yahoo.com>.
--- On Thu, 10/21/10, jmr <jm...@free.fr> wrote:
> From: jmr <jm...@free.fr>
> Subject: A bug in ComplexPhraseQuery ?
> To: solr-user@lucene.apache.org
> Date: Thursday, October 21, 2010, 12:53 PM
>
> Hi,
>
> We have installed ComplexPhraseQuery, and since then we have
> seen strange behaviour in proximity search.
>
> We have the 2 following queries:
> (text:("protein digest"~50))
> (text:("digest protein"~50))
>
> Without ComplexPhraseQuery, both queries return 6
> matching documents.
> With ComplexPhraseQuery, query 1 returns 4 documents and
> query 2 returns 5 documents!
>
> It seems that proximity search is broken. Is this a known
> problem?
ComplexPhraseQuery is an ordered phrase query, whereas Lucene's default PhraseQuery is unordered. With ComplexPhrase, the order of terms is important.
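
To make the ordered/unordered distinction concrete, here is a minimal sketch of a simplified proximity check over token positions. It is not Lucene code: the class ProximityOrderSketch and the method near are hypothetical, the sample document is made up, and the gap arithmetic is a deliberate simplification of how Lucene actually computes slop. The inOrder flag only mirrors the name of the option discussed in this thread.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model (an assumption, not Lucene's implementation): two terms
// match if at most `slop` tokens lie between them, optionally in order.
final class ProximityOrderSketch {
    static boolean near(List<String> tokens, String first, String second,
                        int slop, boolean inOrder) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) continue;
            for (int j = 0; j < tokens.size(); j++) {
                if (i == j || !tokens.get(j).equals(second)) continue;
                int gap = Math.abs(j - i) - 1;        // tokens between the two terms
                boolean orderOk = !inOrder || i < j;  // ordered mode needs first before second
                if (orderOk && gap <= slop) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up document in which "digest" precedes "protein".
        List<String> doc = Arrays.asList("digest", "the", "protein", "overnight");
        System.out.println(near(doc, "protein", "digest", 50, true));   // ordered
        System.out.println(near(doc, "protein", "digest", 50, false));  // unordered
        System.out.println(near(doc, "digest", "protein", 50, true));   // ordered, reversed query
    }
}
```

In this model a document where "digest" comes before "protein" matches the ordered query only when the terms are given in that order, while the unordered check matches either way. That would be consistent with the two ordered queries returning different subsets (4 and 5 documents) of the 6 documents found by the unordered search.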