Posted to solr-user@lucene.apache.org by C0re <bl...@hotmail.co.uk> on 2010/11/05 17:35:34 UTC

Wildcard weirdness

Hi,

I'm trying to understand what Solr is doing when a search for O'Connor and
O'Conn* is done.

The first search returns 4 results, which is fine. I would expect the second
search to return at least 4 (the same) results, however it fails to return
any.

I've debugged the query and this is the output:

Debug for O'Connor :
<str name="rawquerystring">surname:O'Connor</str>
<str name="querystring">surname:O'Connor</str>
<str name="parsedquery">PhraseQuery(surname:"o connor")</str>
<str name="parsedquery_toString">surname:"o connor"</str>

Debug for O'Conn* :
<str name="rawquerystring">surname:O'Conno*</str>
<str name="querystring">surname:O'Conno*</str>
<str name="parsedquery">surname:O'Conno*</str>
<str name="parsedquery_toString">surname:O'Conno*</str>

As you can see, the parsed queries are different, but I don't understand why
Solr changes them the way it does.

Also, searching for Conno* does work.

Thanks,
C.

-- 
View this message in context: http://lucene.472066.n3.nabble.com/Wildcard-weirdness-tp1849362p1849362.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Wildcard weirdness

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
Strange. My second guess would be that stemming is the reason, but if your
analyser(s) emit the same values you use for searching, that's odd.

could you post your schema definition for the surname field?

Re: Wildcard weirdness

Posted by C0re <bl...@hotmail.co.uk>.
Hi Savvas,

Thanks for the reply. Yep I've been trying out the Analysis tool.

As you say the index does lowercase the terms.

Field Name: surname
Index Value: O'Connor
Query Value: connor

The Index Analyzer creates:
o	connor

Which the query value above will match on.

However, if the query value is conno* then there is no match.

Re: Wildcard weirdness

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
One place to start would be the Analysis page:
http://{your machine}:{port}/solr/admin/analysis.jsp?highlight=on
There you can see exactly what happens to your query as it is moved down
the analysis chain.

To my knowledge, no analysis is performed on wildcarded terms, so my guess
is that the analysis chain modifies your terms (e.g. lowercases or stems
them) at index time, and the unmodified wildcard term then cannot match.
If, for instance, your indexed term is lowercased to o'connor and you are
searching for O'Conno*, then Solr will look for any terms starting with
O'Conno and *not* o'conno.
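
To make the case mismatch concrete, here is a minimal Python sketch. It is
not Solr code: the term list and the prefix_matches helper are hypothetical
and only assume that the index holds lowercased terms while the wildcard
prefix is compared exactly as typed.

```python
# Hypothetical term dictionary: what an index-time lowercasing filter
# might have stored for the surname field (an assumption for illustration).
indexed_terms = ["o'connor", "o'neill", "smith"]


def prefix_matches(prefix, terms):
    """Return the terms that start with prefix, compared case-sensitively."""
    return [term for term in terms if term.startswith(prefix)]


# Prefix used exactly as typed, the way an unanalyzed wildcard term would be.
raw_hits = prefix_matches("O'Conno", indexed_terms)

# Same prefix lowercased by the caller before the lookup.
lowered_hits = prefix_matches("O'Conno".lower(), indexed_terms)

print(raw_hits)
print(lowered_hits)
```

In this toy setup only the lowercased prefix finds a term. Whether that alone
is enough on a real index depends on how the field is tokenized.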

As mentioned above, the Analysis page is usually very helpful in
situations like this. :)

hope that helps

Re: Wildcard weirdness

Posted by Ahmet Arslan <io...@yahoo.com>.
> I'm trying to understand what Solr is doing when a search
> for O'Connor and
> O'Conn* is done.
> 
> The first search returns 4 results, which is fine. I would
> expect the second
> search to return at least 4 (the same) results, however it
> fails to return
> any.
> 
> I've debugged the query and this is the output:
> 
> Debug for O'Connor :
> <str name="rawquerystring">surname:O'Connor</str>
> <str name="querystring">surname:O'Connor</str>
> <str name="parsedquery">PhraseQuery(surname:"o connor")</str>
> <str name="parsedquery_toString">surname:"o connor"</str>
> 
> Debug for O'Conn* :
> <str name="rawquerystring">surname:O'Conno*</str>
> <str name="querystring">surname:O'Conno*</str>
> <str name="parsedquery">surname:O'Conno*</str>
> <str name="parsedquery_toString">surname:O'Conno*</str>
> 
> So as you can see the queries are different but I don't
> understand why Solr
> changes them the way it does?

Wildcard queries are not analyzed; that's the reason. Please note that
analysis.jsp does not do actual query parsing.
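
A rough Python sketch of the difference, for illustration only. The analyze
function is a stand-in that assumes the surname field splits on
non-alphanumeric characters and lowercases (the real schema was never posted
in this thread), and parse_term and search are hypothetical helpers, not
Solr APIs.

```python
import re


def analyze(text):
    """Stand-in analyzer (assumption): split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]


def parse_term(value):
    """Toy query parser: a trailing-wildcard term bypasses the analyzer."""
    if value.endswith("*"):
        return ("prefix", value[:-1])   # kept exactly as typed
    return ("phrase", analyze(value))   # analyzed like indexed text


def search(value, indexed_tokens):
    """Return True if the parsed query matches the indexed tokens."""
    kind, payload = parse_term(value)
    if kind == "prefix":
        return any(tok.startswith(payload) for tok in indexed_tokens)
    # Phrase: the analyzed tokens must appear consecutively in the index.
    n = len(payload)
    return any(indexed_tokens[i:i + n] == payload
               for i in range(len(indexed_tokens) - n + 1))


indexed = analyze("O'Connor")        # tokens stored at index time

print(indexed)
print(search("O'Connor", indexed))   # analyzed phrase query
print(search("O'Conno*", indexed))   # unanalyzed prefix "O'Conno"
print(search("conno*", indexed))     # unanalyzed prefix "conno"
print(analyze("conno*"))             # what an analysis-only view would show
```

The last line hints at why an analysis-only view can report no match for
conno*: under this stand-in analyzer the asterisk is treated as plain text
and dropped, leaving the token conno, which is not equal to connor. The
query parser instead treats conno as a prefix.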