You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by "Naranjo, Pedro" <Pe...@stercomm.com> on 2009/07/23 23:44:34 UTC
A question about the relevancy
Hi there,
I have a question
we have two querys which only different is the fact that Query_1 includes phrase queries where Query_2 has the phrase query but converted into a Boolean query.
When each query is executed, Query_1 gives a relevancy of 1.0 and Query_2 gives one of 0.34. The question is why? Wouldnt it the fact that each phrase query is now rewritten as a Boolean query give you a higher ranking as expected because it is matching every keyword in any order?
Please let me know if you see anything else that escaped me.
Sincerely,
Pedro Naranjo
***START********************QUERY_1**************************************
(+prod.MIC.MIProductName:rotarix^0.01
+((((
prod.MI.MITitle:"yyy yyi zzz"
prod.MI.MITitle:"latex allergies allergi"
prod.MI.MITitle:"administration administr"
()
prod.MI.MITitle:rotarix
()
prod.MI.MITitle:"individuals individu"
()
prod.MI.MITitle:"yyy yyi"
prod.MI.MITitle:zzz
(
prod.MI.MITitle:"yyy yyi zzz"
prod.MI.MITitle:"administration administr"
prod.MI.MITitle:rotarix
prod.MI.MITitle:"individuals individu"
prod.MI.MITitle:"yyy yyi"
prod.MI.MITitle:zzz))^15.0)
((
prod.MIFC.MIFileContent:"yyy yyi zzz"
prod.MIFC.MIFileContent:"latex allergies allergi"
prod.MIFC.MIFileContent:"administration administr"
()
prod.MIFC.MIFileContent:rotarix
()
prod.MIFC.MIFileContent:"individuals individu"
()
prod.MIFC.MIFileContent:"yyy yyi"
prod.MIFC.MIFileContent:zzz
(
prod.MIFC.MIFileContent:"yyy yyi zzz"
prod.MIFC.MIFileContent:"administration administr"
prod.MIFC.MIFileContent:rotarix
prod.MIFC.MIFileContent:"individuals individu"
prod.MIFC.MIFileContent:"yyy yyi"
prod.MIFC.MIFileContent:zzz))^1.5)
((
prod.MIC.MIKeyword:"yyy yyi zzz"
prod.MIC.MIKeyword:"latex allergies allergi"
prod.MIC.MIKeyword:"administration administr"
()
prod.MIC.MIKeyword:rotarix
()
prod.MIC.MIKeyword:"individuals individu"
()
prod.MIC.MIKeyword:"yyy yyi"
prod.MIC.MIKeyword:zzz
(
prod.MIC.MIKeyword:"yyy yyi zzz"
prod.MIC.MIKeyword:"administration administr"
prod.MIC.MIKeyword:rotarix
prod.MIC.MIKeyword:"individuals individu"
prod.MIC.MIKeyword:"yyy yyi"
prod.MIC.MIKeyword:zzz))^10.0))))
***END**********************QUERY_1**************************************
***START********************QUERY_2**************************************
(+prod.MIC.MIProductName:rotarix^0.01
+((((
(+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi +prod.MI.MITitle:zzz)
(+prod.MI.MITitle:latex +prod.MI.MITitle:allergies +prod.MI.MITitle:allergi)
(+prod.MI.MITitle:administration +prod.MI.MITitle:administr)
()
prod.MI.MITitle:rotarix
()
(+prod.MI.MITitle:individuals +prod.MI.MITitle:individu)
()
(+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi)
prod.MI.MITitle:zzz
(
(+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi +prod.MI.MITitle:zzz)
(+prod.MI.MITitle:administration +prod.MI.MITitle:administr)
prod.MI.MITitle:rotarix
(+prod.MI.MITitle:individuals +prod.MI.MITitle:individu)
(+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi)
prod.MI.MITitle:zzz))^15.0)
((
(+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi +prod.MIFC.MIFileContent:zzz)
(+prod.MIFC.MIFileContent:latex +prod.MIFC.MIFileContent:allergies +prod.MIFC.MIFileContent:allergi)
(+prod.MIFC.MIFileContent:administration +prod.MIFC.MIFileContent:administr)
()
prod.MIFC.MIFileContent:rotarix
()
(+prod.MIFC.MIFileContent:individuals +prod.MIFC.MIFileContent:individu)
()
(+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi)
prod.MIFC.MIFileContent:zzz
(
(+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi +prod.MIFC.MIFileContent:zzz)
(+prod.MIFC.MIFileContent:administration +prod.MIFC.MIFileContent:administr)
prod.MIFC.MIFileContent:rotarix
(+prod.MIFC.MIFileContent:individuals +prod.MIFC.MIFileContent:individu)
(+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi)
prod.MIFC.MIFileContent:zzz))^1.5)
((
(+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi +prod.MIC.MIKeyword:zzz)
(+prod.MIC.MIKeyword:latex +prod.MIC.MIKeyword:allergies +prod.MIC.MIKeyword:allergi)
(+prod.MIC.MIKeyword:administration +prod.MIC.MIKeyword:administr)
()
prod.MIC.MIKeyword:rotarix
()
(+prod.MIC.MIKeyword:individuals +prod.MIC.MIKeyword:individu)
()
(+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi)
prod.MIC.MIKeyword:zzz
(
(+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi +prod.MIC.MIKeyword:zzz)
(+prod.MIC.MIKeyword:administration +prod.MIC.MIKeyword:administr)
prod.MIC.MIKeyword:rotarix
(+prod.MIC.MIKeyword:individuals +prod.MIC.MIKeyword:individu)
(+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi)
prod.MIC.MIKeyword:zzz))^10.0))))
***END**********************QUERY_2**************************************
RE: A question about the relevancy
Posted by "Naranjo, Pedro" <Pe...@stercomm.com>.
Folks,
Thank you so much for your replay. We will share this with management.
-Pedro
-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: Thu 7/23/2009 4:41 PM
To: java-user@lucene.apache.org
Subject: Re: A question about the relevancy
Also, see http://wiki.apache.org/lucene-java/ScoresAsPercentages. The
relevancy here is that comparing scores across different queries is fairly
meaningless, even if you *do* know how that score was arrived at...
Best
Erick
On Thu, Jul 23, 2009 at 6:17 PM, Otis Gospodnetic <
otis_gospodnetic@yahoo.com> wrote:
> Hi Pedro,
>
> Lucene's Explanation will show you all the juicy details:
>
> http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/Scorer.html#explain(int)<http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/Scorer.html#explain%28int%29>
>
> But with a query like that, I'm not sure if you'll be able to follow
> everything. Maybe pick a super simple pair of queries instead, and look at
> Explanation for those simple queries to see what's happening and how they
> are being scored.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
> > From: "Naranjo, Pedro" <Pe...@stercomm.com>
> > To: java-user@lucene.apache.org
> > Sent: Thursday, July 23, 2009 5:44:34 PM
> > Subject: A question about the relevancy
> >
> > Hi there,
> >
> > I have a question. we have two querys which only different is the fact
> that
> > Query_1 includes phrase queries where Query_2 has the phrase query but
> > converted into a Boolean query.
> >
> > When each query is executed, Query_1 gives a relevancy of 1.0 and Query_2
> gives
> > one of 0.34. The question is why? Wouldn't it the fact that each phrase
> query is
> > now rewritten as a Boolean query give you a higher ranking as expected
> because
> > it is matching every keyword in any order?
> >
> > Please let me know if you see anything else that escaped me.
> >
> > Sincerely,
> >
> >
> > Pedro Naranjo
> >
> > ***START********************QUERY_1**************************************
> > (+prod.MIC.MIProductName:rotarix^0.01
> > +((((
> > prod.MI.MITitle:"yyy yyi zzz"
> > prod.MI.MITitle:"latex allergies allergi"
> > prod.MI.MITitle:"administration administr"
> > ()
> > prod.MI.MITitle:rotarix
> > ()
> > prod.MI.MITitle:"individuals individu"
> > ()
> > prod.MI.MITitle:"yyy yyi"
> > prod.MI.MITitle:zzz
> > (
> > prod.MI.MITitle:"yyy yyi zzz"
> > prod.MI.MITitle:"administration administr"
> > prod.MI.MITitle:rotarix
> > prod.MI.MITitle:"individuals individu"
> > prod.MI.MITitle:"yyy yyi"
> > prod.MI.MITitle:zzz))^15.0)
> > ((
> > prod.MIFC.MIFileContent:"yyy yyi zzz"
> > prod.MIFC.MIFileContent:"latex allergies allergi"
> > prod.MIFC.MIFileContent:"administration administr"
> > ()
> > prod.MIFC.MIFileContent:rotarix
> > ()
> > prod.MIFC.MIFileContent:"individuals individu"
> > ()
> > prod.MIFC.MIFileContent:"yyy yyi"
> > prod.MIFC.MIFileContent:zzz
> > (
> > prod.MIFC.MIFileContent:"yyy yyi zzz"
> > prod.MIFC.MIFileContent:"administration administr"
> > prod.MIFC.MIFileContent:rotarix
> > prod.MIFC.MIFileContent:"individuals individu"
> > prod.MIFC.MIFileContent:"yyy yyi"
> > prod.MIFC.MIFileContent:zzz))^1.5)
> > ((
> > prod.MIC.MIKeyword:"yyy yyi zzz"
> > prod.MIC.MIKeyword:"latex allergies allergi"
> > prod.MIC.MIKeyword:"administration administr"
> > ()
> > prod.MIC.MIKeyword:rotarix
> > ()
> > prod.MIC.MIKeyword:"individuals individu"
> > ()
> > prod.MIC.MIKeyword:"yyy yyi"
> > prod.MIC.MIKeyword:zzz
> > (
> > prod.MIC.MIKeyword:"yyy yyi zzz"
> > prod.MIC.MIKeyword:"administration administr"
> > prod.MIC.MIKeyword:rotarix
> > prod.MIC.MIKeyword:"individuals individu"
> > prod.MIC.MIKeyword:"yyy yyi"
> > prod.MIC.MIKeyword:zzz))^10.0))))
> > ***END**********************QUERY_1**************************************
> > ***START********************QUERY_2**************************************
> > (+prod.MIC.MIProductName:rotarix^0.01
> > +((((
> > (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi +prod.MI.MITitle:zzz)
> > (+prod.MI.MITitle:latex +prod.MI.MITitle:allergies
> +prod.MI.MITitle:allergi)
> >
> > (+prod.MI.MITitle:administration +prod.MI.MITitle:administr)
> > ()
> > prod.MI.MITitle:rotarix
> > ()
> > (+prod.MI.MITitle:individuals +prod.MI.MITitle:individu)
> > ()
> > (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi)
> > prod.MI.MITitle:zzz
> > (
> > (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi +prod.MI.MITitle:zzz)
> > (+prod.MI.MITitle:administration +prod.MI.MITitle:administr)
> > prod.MI.MITitle:rotarix
> > (+prod.MI.MITitle:individuals +prod.MI.MITitle:individu)
> > (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi)
> > prod.MI.MITitle:zzz))^15.0)
> > ((
> > (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi
> > +prod.MIFC.MIFileContent:zzz)
> > (+prod.MIFC.MIFileContent:latex +prod.MIFC.MIFileContent:allergies
> > +prod.MIFC.MIFileContent:allergi)
> > (+prod.MIFC.MIFileContent:administration
> +prod.MIFC.MIFileContent:administr)
> >
> > ()
> > prod.MIFC.MIFileContent:rotarix
> > ()
> > (+prod.MIFC.MIFileContent:individuals
> +prod.MIFC.MIFileContent:individu)
> > ()
> > (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi)
> > prod.MIFC.MIFileContent:zzz
> > (
> > (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi
> > +prod.MIFC.MIFileContent:zzz)
> > (+prod.MIFC.MIFileContent:administration
> +prod.MIFC.MIFileContent:administr)
> >
> > prod.MIFC.MIFileContent:rotarix
> > (+prod.MIFC.MIFileContent:individuals
> +prod.MIFC.MIFileContent:individu)
> > (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi)
> > prod.MIFC.MIFileContent:zzz))^1.5)
> > ((
> > (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi
> +prod.MIC.MIKeyword:zzz)
> > (+prod.MIC.MIKeyword:latex +prod.MIC.MIKeyword:allergies
> > +prod.MIC.MIKeyword:allergi)
> > (+prod.MIC.MIKeyword:administration +prod.MIC.MIKeyword:administr)
> > ()
> > prod.MIC.MIKeyword:rotarix
> > ()
> > (+prod.MIC.MIKeyword:individuals +prod.MIC.MIKeyword:individu)
> > ()
> > (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi)
> > prod.MIC.MIKeyword:zzz
> > (
> > (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi
> +prod.MIC.MIKeyword:zzz)
> > (+prod.MIC.MIKeyword:administration +prod.MIC.MIKeyword:administr)
> > prod.MIC.MIKeyword:rotarix
> > (+prod.MIC.MIKeyword:individuals +prod.MIC.MIKeyword:individu)
> > (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi)
> > prod.MIC.MIKeyword:zzz))^10.0))))
> > ***END**********************QUERY_2**************************************
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
Re: A question about the relevancy
Posted by Erick Erickson <er...@gmail.com>.
Also, see http://wiki.apache.org/lucene-java/ScoresAsPercentages. The
relevancy here is that comparing scores across different queries is fairly
meaningless, even if you *do* know how that score was arrived at...
Best
Erick
On Thu, Jul 23, 2009 at 6:17 PM, Otis Gospodnetic <
otis_gospodnetic@yahoo.com> wrote:
> Hi Pedro,
>
> Lucene's Explanation will show you all the juicy details:
>
> http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/Scorer.html#explain(int)<http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/Scorer.html#explain%28int%29>
>
> But with a query like that, I'm not sure if you'll be able to follow
> everything. Maybe pick a super simple pair of queries instead, and look at
> Explanation for those simple queries to see what's happening and how they
> are being scored.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
> > From: "Naranjo, Pedro" <Pe...@stercomm.com>
> > To: java-user@lucene.apache.org
> > Sent: Thursday, July 23, 2009 5:44:34 PM
> > Subject: A question about the relevancy
> >
> > Hi there,
> >
> > I have a question… we have two querys which only different is the fact
> that
> > Query_1 includes phrase queries where Query_2 has the phrase query but
> > converted into a Boolean query.
> >
> > When each query is executed, Query_1 gives a relevancy of 1.0 and Query_2
> gives
> > one of 0.34. The question is why? Wouldn’t it the fact that each phrase
> query is
> > now rewritten as a Boolean query give you a higher ranking as expected
> because
> > it is matching every keyword in any order?
> >
> > Please let me know if you see anything else that escaped me.
> >
> > Sincerely,
> >
> >
> > Pedro Naranjo
> >
> > ***START********************QUERY_1**************************************
> > (+prod.MIC.MIProductName:rotarix^0.01
> > +((((
> > prod.MI.MITitle:"yyy yyi zzz"
> > prod.MI.MITitle:"latex allergies allergi"
> > prod.MI.MITitle:"administration administr"
> > ()
> > prod.MI.MITitle:rotarix
> > ()
> > prod.MI.MITitle:"individuals individu"
> > ()
> > prod.MI.MITitle:"yyy yyi"
> > prod.MI.MITitle:zzz
> > (
> > prod.MI.MITitle:"yyy yyi zzz"
> > prod.MI.MITitle:"administration administr"
> > prod.MI.MITitle:rotarix
> > prod.MI.MITitle:"individuals individu"
> > prod.MI.MITitle:"yyy yyi"
> > prod.MI.MITitle:zzz))^15.0)
> > ((
> > prod.MIFC.MIFileContent:"yyy yyi zzz"
> > prod.MIFC.MIFileContent:"latex allergies allergi"
> > prod.MIFC.MIFileContent:"administration administr"
> > ()
> > prod.MIFC.MIFileContent:rotarix
> > ()
> > prod.MIFC.MIFileContent:"individuals individu"
> > ()
> > prod.MIFC.MIFileContent:"yyy yyi"
> > prod.MIFC.MIFileContent:zzz
> > (
> > prod.MIFC.MIFileContent:"yyy yyi zzz"
> > prod.MIFC.MIFileContent:"administration administr"
> > prod.MIFC.MIFileContent:rotarix
> > prod.MIFC.MIFileContent:"individuals individu"
> > prod.MIFC.MIFileContent:"yyy yyi"
> > prod.MIFC.MIFileContent:zzz))^1.5)
> > ((
> > prod.MIC.MIKeyword:"yyy yyi zzz"
> > prod.MIC.MIKeyword:"latex allergies allergi"
> > prod.MIC.MIKeyword:"administration administr"
> > ()
> > prod.MIC.MIKeyword:rotarix
> > ()
> > prod.MIC.MIKeyword:"individuals individu"
> > ()
> > prod.MIC.MIKeyword:"yyy yyi"
> > prod.MIC.MIKeyword:zzz
> > (
> > prod.MIC.MIKeyword:"yyy yyi zzz"
> > prod.MIC.MIKeyword:"administration administr"
> > prod.MIC.MIKeyword:rotarix
> > prod.MIC.MIKeyword:"individuals individu"
> > prod.MIC.MIKeyword:"yyy yyi"
> > prod.MIC.MIKeyword:zzz))^10.0))))
> > ***END**********************QUERY_1**************************************
> > ***START********************QUERY_2**************************************
> > (+prod.MIC.MIProductName:rotarix^0.01
> > +((((
> > (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi +prod.MI.MITitle:zzz)
> > (+prod.MI.MITitle:latex +prod.MI.MITitle:allergies
> +prod.MI.MITitle:allergi)
> >
> > (+prod.MI.MITitle:administration +prod.MI.MITitle:administr)
> > ()
> > prod.MI.MITitle:rotarix
> > ()
> > (+prod.MI.MITitle:individuals +prod.MI.MITitle:individu)
> > ()
> > (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi)
> > prod.MI.MITitle:zzz
> > (
> > (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi +prod.MI.MITitle:zzz)
> > (+prod.MI.MITitle:administration +prod.MI.MITitle:administr)
> > prod.MI.MITitle:rotarix
> > (+prod.MI.MITitle:individuals +prod.MI.MITitle:individu)
> > (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi)
> > prod.MI.MITitle:zzz))^15.0)
> > ((
> > (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi
> > +prod.MIFC.MIFileContent:zzz)
> > (+prod.MIFC.MIFileContent:latex +prod.MIFC.MIFileContent:allergies
> > +prod.MIFC.MIFileContent:allergi)
> > (+prod.MIFC.MIFileContent:administration
> +prod.MIFC.MIFileContent:administr)
> >
> > ()
> > prod.MIFC.MIFileContent:rotarix
> > ()
> > (+prod.MIFC.MIFileContent:individuals
> +prod.MIFC.MIFileContent:individu)
> > ()
> > (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi)
> > prod.MIFC.MIFileContent:zzz
> > (
> > (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi
> > +prod.MIFC.MIFileContent:zzz)
> > (+prod.MIFC.MIFileContent:administration
> +prod.MIFC.MIFileContent:administr)
> >
> > prod.MIFC.MIFileContent:rotarix
> > (+prod.MIFC.MIFileContent:individuals
> +prod.MIFC.MIFileContent:individu)
> > (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi)
> > prod.MIFC.MIFileContent:zzz))^1.5)
> > ((
> > (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi
> +prod.MIC.MIKeyword:zzz)
> > (+prod.MIC.MIKeyword:latex +prod.MIC.MIKeyword:allergies
> > +prod.MIC.MIKeyword:allergi)
> > (+prod.MIC.MIKeyword:administration +prod.MIC.MIKeyword:administr)
> > ()
> > prod.MIC.MIKeyword:rotarix
> > ()
> > (+prod.MIC.MIKeyword:individuals +prod.MIC.MIKeyword:individu)
> > ()
> > (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi)
> > prod.MIC.MIKeyword:zzz
> > (
> > (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi
> +prod.MIC.MIKeyword:zzz)
> > (+prod.MIC.MIKeyword:administration +prod.MIC.MIKeyword:administr)
> > prod.MIC.MIKeyword:rotarix
> > (+prod.MIC.MIKeyword:individuals +prod.MIC.MIKeyword:individu)
> > (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi)
> > prod.MIC.MIKeyword:zzz))^10.0))))
> > ***END**********************QUERY_2**************************************
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
Re: A question about the relevancy
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Pedro,
Lucene's Explanation will show you all the juicy details:
http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/Scorer.html#explain(int)
But with a query like that, I'm not sure if you'll be able to follow everything. Maybe pick a super simple pair of queries instead, and look at Explanation for those simple queries to see what's happening and how they are being scored.
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
----- Original Message ----
> From: "Naranjo, Pedro" <Pe...@stercomm.com>
> To: java-user@lucene.apache.org
> Sent: Thursday, July 23, 2009 5:44:34 PM
> Subject: A question about the relevancy
>
> Hi there,
>
> I have a question… we have two querys which only different is the fact that
> Query_1 includes phrase queries where Query_2 has the phrase query but
> converted into a Boolean query.
>
> When each query is executed, Query_1 gives a relevancy of 1.0 and Query_2 gives
> one of 0.34. The question is why? Wouldn’t it the fact that each phrase query is
> now rewritten as a Boolean query give you a higher ranking as expected because
> it is matching every keyword in any order?
>
> Please let me know if you see anything else that escaped me.
>
> Sincerely,
>
>
> Pedro Naranjo
>
> ***START********************QUERY_1**************************************
> (+prod.MIC.MIProductName:rotarix^0.01
> +((((
> prod.MI.MITitle:"yyy yyi zzz"
> prod.MI.MITitle:"latex allergies allergi"
> prod.MI.MITitle:"administration administr"
> ()
> prod.MI.MITitle:rotarix
> ()
> prod.MI.MITitle:"individuals individu"
> ()
> prod.MI.MITitle:"yyy yyi"
> prod.MI.MITitle:zzz
> (
> prod.MI.MITitle:"yyy yyi zzz"
> prod.MI.MITitle:"administration administr"
> prod.MI.MITitle:rotarix
> prod.MI.MITitle:"individuals individu"
> prod.MI.MITitle:"yyy yyi"
> prod.MI.MITitle:zzz))^15.0)
> ((
> prod.MIFC.MIFileContent:"yyy yyi zzz"
> prod.MIFC.MIFileContent:"latex allergies allergi"
> prod.MIFC.MIFileContent:"administration administr"
> ()
> prod.MIFC.MIFileContent:rotarix
> ()
> prod.MIFC.MIFileContent:"individuals individu"
> ()
> prod.MIFC.MIFileContent:"yyy yyi"
> prod.MIFC.MIFileContent:zzz
> (
> prod.MIFC.MIFileContent:"yyy yyi zzz"
> prod.MIFC.MIFileContent:"administration administr"
> prod.MIFC.MIFileContent:rotarix
> prod.MIFC.MIFileContent:"individuals individu"
> prod.MIFC.MIFileContent:"yyy yyi"
> prod.MIFC.MIFileContent:zzz))^1.5)
> ((
> prod.MIC.MIKeyword:"yyy yyi zzz"
> prod.MIC.MIKeyword:"latex allergies allergi"
> prod.MIC.MIKeyword:"administration administr"
> ()
> prod.MIC.MIKeyword:rotarix
> ()
> prod.MIC.MIKeyword:"individuals individu"
> ()
> prod.MIC.MIKeyword:"yyy yyi"
> prod.MIC.MIKeyword:zzz
> (
> prod.MIC.MIKeyword:"yyy yyi zzz"
> prod.MIC.MIKeyword:"administration administr"
> prod.MIC.MIKeyword:rotarix
> prod.MIC.MIKeyword:"individuals individu"
> prod.MIC.MIKeyword:"yyy yyi"
> prod.MIC.MIKeyword:zzz))^10.0))))
> ***END**********************QUERY_1**************************************
> ***START********************QUERY_2**************************************
> (+prod.MIC.MIProductName:rotarix^0.01
> +((((
> (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi +prod.MI.MITitle:zzz)
> (+prod.MI.MITitle:latex +prod.MI.MITitle:allergies +prod.MI.MITitle:allergi)
>
> (+prod.MI.MITitle:administration +prod.MI.MITitle:administr)
> ()
> prod.MI.MITitle:rotarix
> ()
> (+prod.MI.MITitle:individuals +prod.MI.MITitle:individu)
> ()
> (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi)
> prod.MI.MITitle:zzz
> (
> (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi +prod.MI.MITitle:zzz)
> (+prod.MI.MITitle:administration +prod.MI.MITitle:administr)
> prod.MI.MITitle:rotarix
> (+prod.MI.MITitle:individuals +prod.MI.MITitle:individu)
> (+prod.MI.MITitle:yyy +prod.MI.MITitle:yyi)
> prod.MI.MITitle:zzz))^15.0)
> ((
> (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi
> +prod.MIFC.MIFileContent:zzz)
> (+prod.MIFC.MIFileContent:latex +prod.MIFC.MIFileContent:allergies
> +prod.MIFC.MIFileContent:allergi)
> (+prod.MIFC.MIFileContent:administration +prod.MIFC.MIFileContent:administr)
>
> ()
> prod.MIFC.MIFileContent:rotarix
> ()
> (+prod.MIFC.MIFileContent:individuals +prod.MIFC.MIFileContent:individu)
> ()
> (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi)
> prod.MIFC.MIFileContent:zzz
> (
> (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi
> +prod.MIFC.MIFileContent:zzz)
> (+prod.MIFC.MIFileContent:administration +prod.MIFC.MIFileContent:administr)
>
> prod.MIFC.MIFileContent:rotarix
> (+prod.MIFC.MIFileContent:individuals +prod.MIFC.MIFileContent:individu)
> (+prod.MIFC.MIFileContent:yyy +prod.MIFC.MIFileContent:yyi)
> prod.MIFC.MIFileContent:zzz))^1.5)
> ((
> (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi +prod.MIC.MIKeyword:zzz)
> (+prod.MIC.MIKeyword:latex +prod.MIC.MIKeyword:allergies
> +prod.MIC.MIKeyword:allergi)
> (+prod.MIC.MIKeyword:administration +prod.MIC.MIKeyword:administr)
> ()
> prod.MIC.MIKeyword:rotarix
> ()
> (+prod.MIC.MIKeyword:individuals +prod.MIC.MIKeyword:individu)
> ()
> (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi)
> prod.MIC.MIKeyword:zzz
> (
> (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi +prod.MIC.MIKeyword:zzz)
> (+prod.MIC.MIKeyword:administration +prod.MIC.MIKeyword:administr)
> prod.MIC.MIKeyword:rotarix
> (+prod.MIC.MIKeyword:individuals +prod.MIC.MIKeyword:individu)
> (+prod.MIC.MIKeyword:yyy +prod.MIC.MIKeyword:yyi)
> prod.MIC.MIKeyword:zzz))^10.0))))
> ***END**********************QUERY_2**************************************
>
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org