Posted to java-user@lucene.apache.org by Luke Shannon <ls...@futurebrand.com> on 2005/02/17 20:44:50 UTC

Query Question

Hello;

Why won't this query find the document below?

Query:
+(type:203) +(name:*home\**)

Document (relevant fields):
Keyword<type:203>
Keyword<name:marcipan + home*>

I was hoping that escaping the * would make it be treated as a literal
character. What am I doing wrong?

Thanks,

Luke



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Query Question

Posted by Luke Shannon <ls...@futurebrand.com>.
Thanks Erik. Option 2 sounds like the path of least resistance.
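
For readers following along, here is a minimal sketch of what option 2 might look like in plain Java. Everything in it is an assumption for illustration: the class name, the helper names, and the placeholder character (a private-use code point assumed never to occur in the indexed data). It only shows the string substitution; the encoded values would then be handed to the indexing code and to WildcardQuery.

```java
// Hypothetical helper for option 2: swap literal asterisks for a placeholder
// before indexing, and do the same for escaped asterisks in query patterns.
final class AsteriskSubstitution {

    // Assumed placeholder: must never appear in real data and must not be
    // a wildcard character.
    static final char PLACEHOLDER = '\uE000';

    // Applied to a field value before it is indexed.
    static String encodeTerm(String rawValue) {
        return rawValue.replace('*', PLACEHOLDER);
    }

    // Applied to a user pattern before building the wildcard query.
    // A backslash followed by '*' means "literal asterisk"; a bare '*'
    // stays a wildcard.
    static String encodeQuery(String userPattern) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < userPattern.length(); i++) {
            char c = userPattern.charAt(i);
            boolean escapedStar = c == '\\'
                    && i + 1 < userPattern.length()
                    && userPattern.charAt(i + 1) == '*';
            if (escapedStar) {
                out.append(PLACEHOLDER);
                i++; // skip the asterisk that was just consumed
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Applied to a stored value before showing it to the user.
    static String decodeTerm(String storedValue) {
        return storedValue.replace(PLACEHOLDER, '*');
    }

    public static void main(String[] args) {
        String indexed = encodeTerm("marcipan + home*");
        String pattern = encodeQuery("*home\\**"); // the characters *home\**
        System.out.println(indexed.equals("marcipan + home" + PLACEHOLDER));
        System.out.println(pattern.equals("*home" + PLACEHOLDER + "*"));
        System.out.println(decodeTerm(indexed));
    }
}
```

With this in place the pattern becomes *home, the placeholder, then a real wildcard, which lines up with the encoded term.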

Luke
----- Original Message ----- 
From: "Erik Hatcher" <er...@ehatchersolutions.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Thursday, February 17, 2005 9:05 PM
Subject: Re: Query Question




Re: Query Question

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 17, 2005, at 5:51 PM, Luke Shannon wrote:
> My manager is now totally stuck about being able to query data with * 
> in it.

He's gonna have to wait a bit longer, you've got a slightly tricky 
situation on your hands....

> WildcardQuery(new Term("name", "*home\**"));

The \* is the problem.  WildcardQuery doesn't handle escaping the way 
you're attempting.  Your query is essentially this now:

	*home\*

The backslash has no special meaning at all... you're literally 
looking for all terms that contain home followed by a backslash.  
The two asterisks at the end collapse into a single one logically.
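
To make the behaviour concrete, here is a small self-contained sketch of a wildcard matcher that, like the behaviour described above, does no escaping. It is not Lucene's implementation; the class and method names are made up for illustration, and it only assumes that * matches any run of characters and ? matches exactly one.

```java
// Hypothetical stand-in for wildcard term matching with no escape handling.
// This is not Lucene's code; it only mirrors the behaviour described above.
final class NoEscapeWildcard {

    // '*' matches any run of characters (including none), '?' matches exactly
    // one character, and everything else, backslash included, is literal.
    static boolean matches(String pattern, String term) {
        return matches(pattern, 0, term, 0);
    }

    private static boolean matches(String p, int pi, String t, int ti) {
        if (pi == p.length()) {
            return ti == t.length();
        }
        char c = p.charAt(pi);
        if (c == '*') {
            // Try every possible length for the run that '*' absorbs.
            for (int next = ti; next <= t.length(); next++) {
                if (matches(p, pi + 1, t, next)) {
                    return true;
                }
            }
            return false;
        }
        if (ti == t.length()) {
            return false;
        }
        if (c == '?' || c == t.charAt(ti)) {
            return matches(p, pi + 1, t, ti + 1);
        }
        return false;
    }

    public static void main(String[] args) {
        String pattern = "*home\\**"; // the characters *home\**
        System.out.println(matches(pattern, "marcipan + home*"));   // false
        System.out.println(matches(pattern, "marcipan + home\\x")); // true
        System.out.println(matches("*home*", "marcipan + home*"));  // true
    }
}
```

The indexed term contains no backslash, so the first call fails; only a term that really contains home followed by a backslash gets through.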

> Any theories as to why it would not match:
>
> Document (relevant fields):
> Keyword<type:203>
> Keyword<name:marcipan + home*>
>
> Is the \ escaping both * characters?

So, again, no escaping is being done here.  You're a bit stuck in this 
situation because * (and ?) are special to WildcardQuery, and it does 
no escaping.  Two options I can think of:

	- Build your own clone of WildcardQuery that does escaping - or 
perhaps change the wildcard characters to something you do not index 
and use those instead.

	- Replace asterisks in the terms indexed with some other non-wildcard 
character, then replace it on your queries as appropriate.
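
As a rough illustration of the first option, the matching logic of an escape-aware clone could look something like the sketch below. It is plain Java with no Lucene dependency, and the class name and the choice of backslash as the escape character are assumptions; wiring this into an actual Query subclass is left out.

```java
// Hypothetical matching logic for an escape-aware wildcard matcher.
// A backslash makes the following character literal; a trailing lone
// backslash is treated as a literal backslash.
final class EscapingWildcard {

    static boolean matches(String pattern, String term) {
        return matches(pattern, 0, term, 0);
    }

    private static boolean matches(String p, int pi, String t, int ti) {
        if (pi == p.length()) {
            return ti == t.length();
        }
        char c = p.charAt(pi);
        if (c == '\\' && pi + 1 < p.length()) {
            // Escaped character: it must appear literally in the term.
            return ti < t.length()
                    && t.charAt(ti) == p.charAt(pi + 1)
                    && matches(p, pi + 2, t, ti + 1);
        }
        if (c == '*') {
            // Unescaped '*': try every possible length for the run it absorbs.
            for (int next = ti; next <= t.length(); next++) {
                if (matches(p, pi + 1, t, next)) {
                    return true;
                }
            }
            return false;
        }
        if (ti == t.length()) {
            return false;
        }
        if (c == '?' || c == t.charAt(ti)) {
            return matches(p, pi + 1, t, ti + 1);
        }
        return false;
    }

    public static void main(String[] args) {
        String pattern = "*home\\**"; // the characters *home\**
        System.out.println(matches(pattern, "marcipan + home*"));     // true
        System.out.println(matches(pattern, "marcipan + homestead")); // false
    }
}
```

Here the escaped asterisk only matches a real asterisk in the term, which is the behaviour the original query was hoping for.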

Erik




Re: Query Question

Posted by Luke Shannon <ls...@futurebrand.com>.
Hello;

My manager is now totally stuck on being able to query data with * in it.

Here are two queries.

TermQuery(new Term("type", "203"));
WildcardQuery(new Term("name", "*home\**"));

They are joined in a boolean query. Calling toString() on that query gives
this result:

+(type:203) +(name:*home\**)

This looks right to me.

Any theories as to why it would not match:

Document (relevant fields):
Keyword<type:203>
Keyword<name:marcipan + home*>

Is the \ escaping both * characters?

Thanks,

Luke




----- Original Message ----- 
From: "Luke Shannon" <ls...@futurebrand.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Thursday, February 17, 2005 2:44 PM
Subject: Query Question




Re: Query Question

Posted by Luke Shannon <ls...@futurebrand.com>.
That is the query's toString() output. I created the query using a
WildcardQuery object.

Luke

----- Original Message ----- 
From: "Erik Hatcher" <er...@ehatchersolutions.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Thursday, February 17, 2005 3:00 PM
Subject: Re: Query Question




Re: Query Question

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 17, 2005, at 2:44 PM, Luke Shannon wrote:

> Hello;
>
> Why won't this query find the document below?
>
> Query:
> +(type:203) +(name:*home\**)

Is that the query's toString() output?  Or is that what you handed to 
QueryParser?

Depending on your analyzer, 203 may go away.  QueryParser doesn't 
support leading asterisks, so "*home" would fail to parse.

> Document (relevant fields):
> Keyword<type:203>
> Keyword<name:marcipan + home*>
>
> I was hoping by escaping the * it would be treated as a string. What 
> am I
> doing wrong?

