Posted to java-user@lucene.apache.org by ruchi thakur <ru...@gmail.com> on 2007/03/07 18:25:31 UTC

Query String for a phrase?

Hello,
Please suggest what the query string for a phrase search should be.
Thanks and Regards,
Ruchi

Re: Query String for a phrase?

Posted by ruchi thakur <ru...@gmail.com>.
Thanks a lot for your help. I am now using the documented syntax for phrase queries.
Regards,
Ruchi


Re: Query String for a phrase?

Posted by Chris Hostetter <ho...@fucit.org>.
: OK, so does that mean I can use both q1 and q2 for a phrase query, i.e. for
: searching for words adjacent to each other? Actually that was my only concern,
: as I wanted to use q1 for the phrase query rather than q2.
: Regards,

Your example "q1" is not the correct syntax for a phrase query. The
correct syntax is to put quotes around your words.

You happen to be getting phrase queries for your "q1" example because of
the analyzer you are using. QueryParser does one pass at parsing to
look for the special metacharacters it understands, and then passes the
chunks of text it finds to your analyzer. If the analyzer gives back a
stream of more than one token for a chunk, QueryParser makes a phrase query
out of it. Because your analyzer splits "jakarta&apache" into two tokens,
QueryParser makes a phrase query.

You should not rely on this behavior: if at some point your
analyzer changes (or if you are using StandardAnalyzer and it encounters a
situation where it treats the "&" as a legitimate part of a word, which it
might in cases I can't think of off the top of my head), you won't get a
phrase query, you'll get a single-word query.

Use the documented syntax to get the documented behavior. If you don't
like that syntax, you'll need a different query parser.
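
The two-step behavior described above can be sketched as a toy model. This is plain Java, not Lucene code: the class name, both stand-in "analyzers" and the describe() helper are hypothetical and only mimic the idea that one unquoted chunk becomes a phrase when the analyzer returns more than one token for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Toy model only: no Lucene classes are used, and all names are hypothetical.
class ChunkToQuerySketch {

    // Stand-in for an analyzer that lowercases and splits at every
    // character that is not an ASCII letter, so "&" separates tokens.
    static List<String> letterAnalyzer(String chunk) {
        List<String> tokens = new ArrayList<>();
        for (String t : chunk.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stand-in for an analyzer that splits on whitespace only,
    // so "&" stays inside the token.
    static List<String> whitespaceAnalyzer(String chunk) {
        List<String> tokens = new ArrayList<>();
        for (String t : chunk.split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Second step: decide what to build from the tokens of ONE chunk.
    // One token gives a single-term query, several tokens give a phrase.
    static String describe(String chunk, Function<String, List<String>> analyzer) {
        List<String> tokens = analyzer.apply(chunk);
        if (tokens.isEmpty()) {
            return "(nothing)";
        }
        if (tokens.size() == 1) {
            return "TermQuery(" + tokens.get(0) + ")";
        }
        return "PhraseQuery(\"" + String.join(" ", tokens) + "\")";
    }

    public static void main(String[] args) {
        System.out.println(describe("jakarta&apache", ChunkToQuerySketch::letterAnalyzer));
        System.out.println(describe("jakarta&apache", ChunkToQuerySketch::whitespaceAnalyzer));
    }
}
```

With the letter-splitting stand-in the chunk comes out as PhraseQuery("jakarta apache"); with the whitespace stand-in it comes out as TermQuery(jakarta&apache). The same input string gives a different kind of query depending only on the analyzer, which is why the quoted form is the safer choice.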


-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Query String for a phrase?

Posted by ruchi thakur <ru...@gmail.com>.
OK, so does that mean I can use both q1 and q2 for a phrase query, i.e. for
searching for words adjacent to each other? Actually that was my only concern,
as I wanted to use q1 for the phrase query rather than q2.
Regards,


Re: Query String for a phrase?

Posted by Steffen Heinrich <lu...@atablis.com>.
On 11 Mar 2007 at 22:58, ruchi thakur wrote:

> Thanks a lot for your help.
> Below is a snippet from the code I am using for search:
>
>     org.apache.lucene.analysis.StopAnalyzer sa = new org.apache.lucene.analysis.StopAnalyzer();
>     org.apache.lucene.analysis.Analyzer analyzer = sa;
>     QueryParser parser = new QueryParser(dIndexField, analyzer);
>     Query query = parser.parse(sSearchStr);
>     hits = is.search(query);
>
> q1: jakarta&apache  -> BooleanQuery("jakarta" OR/AND "apache")
> q2: "jakarta apache"  -> PhraseQuery("jakarta apache")
>
> When I use the queries above, I get the same result for both, which is what
> puzzles me.
> For two documents d1="jakarta otherword apache" and d2="jakarta apache
> otherword", both q1 and q2 find only d2, i.e. query q1 is also looking for
> documents with jakarta apache as one phrase.
>
> Any ideas? I have tested it, though I will test it again as suggested.
> Regards,
> Reena
Hi Ruchi,

Trying it out with Luke and the StopAnalyzer reveals that q1 actually
also gets translated to "jakarta apache".
The syntax for the Boolean query should be (with a space)
jakarta +apache

Cheers, Steffen




Re: Query String for a phrase?

Posted by ruchi thakur <ru...@gmail.com>.
Thanks a lot for your help.
Below is a snippet from the code I am using for search:

    org.apache.lucene.analysis.StopAnalyzer sa = new org.apache.lucene.analysis.StopAnalyzer();
    org.apache.lucene.analysis.Analyzer analyzer = sa;
    QueryParser parser = new QueryParser(dIndexField, analyzer);
    Query query = parser.parse(sSearchStr);
    hits = is.search(query);

q1: jakarta&apache  -> BooleanQuery("jakarta" OR/AND "apache")
q2: "jakarta apache"  -> PhraseQuery("jakarta apache")

When I use the queries above, I get the same result for both, which is what
puzzles me.
For two documents d1="jakarta otherword apache" and d2="jakarta apache
otherword", both q1 and q2 find only d2, i.e. query q1 is also looking for
documents with jakarta apache as one phrase.

Any ideas? I have tested it, though I will test it again as suggested.
Regards,
Reena

Re: Query String for a phrase?

Posted by Doron Cohen <DO...@il.ibm.com>.
"ruchi thakur" <ru...@gmail.com> wrote on 11/03/2007 04:36:39:

> So I just wanted to make sure that
>
> jakarta&apache  -> jakarta apache
> like
> "jakarta apache"  -> jakarta apache
>
> i.e. that jakarta&apache searches for the phrase jakarta apache.
> Regards,
> Ruchi

q1: jakarta&apache  -> BooleanQuery("jakarta" OR/AND "apache")
q2: "jakarta apache"  -> PhraseQuery("jakarta apache")

For two documents d1="jakarta otherword apache" and d2="jakarta apache
otherword", q1 would find both documents but q2 would only find d2.
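
To make the d1/d2 distinction concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class and method names are made up for illustration, and documents are simply split on whitespace.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: no Lucene classes are used, and all names are hypothetical.
class PhraseVsBooleanSketch {

    // Split a document into tokens on whitespace.
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Boolean OR: at least one term occurs somewhere in the document.
    static boolean matchesAny(String doc, String... terms) {
        List<String> docTokens = tokens(doc);
        for (String term : terms) {
            if (docTokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // Boolean AND: every term occurs somewhere in the document.
    static boolean matchesAll(String doc, String... terms) {
        return tokens(doc).containsAll(Arrays.asList(terms));
    }

    // Phrase: the terms occur at consecutive positions, in the given order.
    static boolean matchesPhrase(String doc, String... terms) {
        return Collections.indexOfSubList(tokens(doc), Arrays.asList(terms)) >= 0;
    }

    public static void main(String[] args) {
        String d1 = "jakarta otherword apache";
        String d2 = "jakarta apache otherword";
        System.out.println("d1 AND:    " + matchesAll(d1, "jakarta", "apache"));
        System.out.println("d1 phrase: " + matchesPhrase(d1, "jakarta", "apache"));
        System.out.println("d2 AND:    " + matchesAll(d2, "jakarta", "apache"));
        System.out.println("d2 phrase: " + matchesPhrase(d2, "jakarta", "apache"));
    }
}
```

In this model the Boolean checks accept both d1 and d2, while the phrase check accepts only d2, because "otherword" sits between the two terms in d1.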





Re: Query String for a phrase?

Posted by Erick Erickson <er...@gmail.com>.
It depends upon your analyzer during both index and search operations.

WhitespaceAnalyzer would do nothing to your string, and
you'd have "jakarta&apache".

StandardAnalyzer would give you two terms, "jakarta" and "apache"
and at query time this would be either jakarta AND apache
or jakarta OR apache, depending on what the default operator is.

And phrase search is something else again. That is, a
PhraseQuery for "jakarta apache" would only match the
two words if they were right next to each other.
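
Why the analyzer matters at both index time and search time can be shown with a small toy model. This is plain Java with no Lucene classes: the two splitter functions are simplified stand-ins for analyzers, not the real WhitespaceAnalyzer or StandardAnalyzer, and every name here is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

// Toy model only: no Lucene classes are used, and all names are hypothetical.
class AnalyzerMismatchSketch {

    // Stand-in analyzer: split on whitespace, keep "&" inside tokens.
    static List<String> splitOnWhitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stand-in analyzer: lowercase and split at every non-letter (ASCII only).
    static List<String> splitOnNonLetters(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // True if every token produced from the query is among the tokens
    // that were produced from the document at "index time".
    static boolean allTermsFound(String doc,
                                 Function<String, List<String>> indexAnalyzer,
                                 String query,
                                 Function<String, List<String>> queryAnalyzer) {
        Set<String> indexed = new HashSet<>(indexAnalyzer.apply(doc));
        List<String> queryTokens = queryAnalyzer.apply(query);
        return !queryTokens.isEmpty() && indexed.containsAll(queryTokens);
    }

    public static void main(String[] args) {
        String doc = "jakarta&apache";
        String query = "jakarta&apache";
        // Same analyzer on both sides: the single token matches itself.
        System.out.println(allTermsFound(doc, AnalyzerMismatchSketch::splitOnWhitespace,
                                         query, AnalyzerMismatchSketch::splitOnWhitespace));
        // Mismatch: index holds "jakarta&apache", query asks for "jakarta" and "apache".
        System.out.println(allTermsFound(doc, AnalyzerMismatchSketch::splitOnWhitespace,
                                         query, AnalyzerMismatchSketch::splitOnNonLetters));
    }
}
```

The first call finds the document and the second does not, even though document and query text are identical, because the index-time and query-time tokens no longer line up.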

You'll get a lot more and better help if you tell us what analyzers
you're using <G>...

Erick





Re: Query String for a phrase?

Posted by ruchi thakur <ru...@gmail.com>.
I am sorry, I guess the "*" caused confusion.
My question is this: using jakarta&apache I am able to search for jakarta
apache, but I was confused because I could not find any reference to this
query string (jakarta&apache) anywhere on the net.

So I just wanted to make sure that

jakarta&apache  -> jakarta apache
like
"jakarta apache"  -> jakarta apache

i.e. that jakarta&apache searches for the phrase jakarta apache.
Regards,
Ruchi



Re: Query String for a phrase?

Posted by Doron Cohen <DO...@il.ibm.com>.
"ruchi thakur" <ru...@gmail.com> wrote on 10/03/2007 19:32:14:

> Does that mean *jakarta&apache* should search for *jakartaapache*?

I assume '*' here is for emphasizing the query text. This is somewhat
confusing, because '*' is part of Lucene's query syntax for wildcard search.
To the question: usually no, but it depends. You could write an analyzer
that would emit a token jakartaapache for input of jakarta&apache, though my
guess is that this is not the case, and that jakarta&apache and jakartaapache
are two distinct words in your index. See the Lucene FAQ, in particular
"Why am I getting no hits / incorrect hits?", starting with its
recommendation to examine query.toString().

Hope this helps,
Doron



Re: Query String for a phrase?

Posted by ruchi thakur <ru...@gmail.com>.
Does that mean *jakarta&apache* should search for *jakartaapache*?
Using *jakarta&apache* I am able to search for *jakarta apache*, but I was
confused because I could not find any reference to this query string
(jakarta&apache) anywhere on the net.

Regards,
Ruchi

Re: Query String for a phrase?

Posted by Doron Cohen <DO...@il.ibm.com>.
Most likely the string  jakarta&apache  is analyzed as a single word,
both at indexing time and at search time.

See also "AnalysisParalysis" in Lucene Wiki.



Re: Query String for a phrase?

Posted by ruchi thakur <ru...@gmail.com>.
Thanks Patrick. One more question. The info at the link says to use the
query below for a phrase:
"jakarta apache"
It works fine.
But when I run jakarta&apache, it has the same effect, i.e. it behaves like
a phrase. It works fine too. Although it is working, I am still a little
doubtful, as I could not find this phrase representation anywhere on the net.
So I am worried that jakarta&apache might fail somewhere.

Regards,
Ruchi


Re: Query String for a phrase?

Posted by Patrick Turcotte <pa...@gmail.com>.
Hi,


> Please suggest what the query string for a phrase search should be.


Did you take a look at:
http://lucene.apache.org/java/docs/queryparsersyntax.html ?

Patrick