Posted to java-user@lucene.apache.org by "Haghighi, Nariman" <Na...@workopolis.com> on 2010/02/01 22:32:29 UTC

ComplexPhraseQueryParser (Expanded Form and Boosting)

We are relying on the ComplexPhraseQueryParser for some impressive matching capabilities.

Our concern is that wildcard queries of the form "quality operations providing quality food services job requirements: click here to apply for this job*", for instance, take 2-5 seconds to execute and require raising maxClauseCount above 40K. I find it hard to believe that our index contains over 40K unique words with the prefix 'job', so the first question is: how does one see the expanded form of this query? We've installed the latest Luke for Lucene 3, but we aren't able to reproduce the same search there, as it doesn't seem to support wildcard queries.

Second concern: a boost on a phrase ("java developer"^10.0) doesn't appear to be applied, judging by the result explanations, when using the ComplexPhraseQueryParser. The boost is respected on single-word queries, and it is respected on phrases when using the basic QueryParser.

Any ideas?
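
[Editor's note] On the first question: in Lucene, Query.rewrite(IndexReader) returns the rewritten form of a query, and printing that result is the usual way to inspect an expansion; the clause ceiling is the static BooleanQuery.setMaxClauseCount(int). Whether that yields a readable expansion for queries built by ComplexPhraseQueryParser is not confirmed in this thread. The following is only a minimal, Lucene-free sketch of the idea behind prefix expansion, useful for reasoning about how many clauses a trailing "job*" could produce. The class name, the method names, and the toy term dictionary are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical, Lucene-free model of prefix expansion over a sorted term dictionary.
final class PrefixExpansionSketch {

    // Returns every term that starts with the given prefix, in sorted order.
    static List<String> expandPrefix(SortedSet<String> termDictionary, String prefix) {
        // Terms sharing a prefix are contiguous in sorted order, so a range
        // from the prefix up to the prefix followed by Character.MAX_VALUE covers them.
        return new ArrayList<String>(
                termDictionary.subSet(prefix, prefix + Character.MAX_VALUE));
    }

    // Renders the expansion as one optional clause per term, e.g. "TEXT:job TEXT:jobs".
    static String toExpandedForm(String field, List<String> terms) {
        StringBuilder sb = new StringBuilder();
        for (String term : terms) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(field).append(':').append(term);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Toy dictionary; a real index would supply its own terms.
        SortedSet<String> dictionary = new TreeSet<String>(
                Arrays.asList("java", "job", "jobless", "jobs", "jobsite", "joy"));
        List<String> expanded = expandPrefix(dictionary, "job");
        System.out.println(expanded.size() + " clauses: " + toExpandedForm("TEXT", expanded));
    }
}
```

In this model the clause count equals the number of distinct terms sharing the prefix, which is why a 40K ceiling for a single 'job' prefix looks surprising and is worth checking against the actual rewritten query.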





Re: ComplexPhraseQueryParser (Expanded Form and Boosting)

Posted by "Karsten F." <ka...@fiz-technik.de>.
Hi Nariman,

As I understand it, ComplexPhraseQueryParser is no longer
supported:
http://issues.apache.org/jira/browse/LUCENE-1486#action_12782254

Instead, with Lucene 3.1 the new
org.apache.lucene.queryParser.standard.parser.StandardSyntaxParser will do
this job:
https://issues.apache.org/jira/browse/LUCENE-1823

If I am wrong, please let me know.
In particular, I am very keen to see a ComplexPhraseQueryParser that does not
throw an exception for
author:"fred* smith" 

Best regards
   Karsten


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: ComplexPhraseQueryParser (Expanded Form and Boosting)

Posted by "Haghighi, Nariman" <Na...@workopolis.com>.
I'm not able to see the boost applied even with an additional term added. 

The original query:

+(JOB_TITLE:"java developer"^15.0 TEXT:"java developer") +LANGUAGE:EN +GATEWAY:work

Modified to:

+(JOB_TITLE:"java developer"^15.0 JOB_TITLE:java TEXT:"java developer") +LANGUAGE:EN +GATEWAY:work

Produces:

39.992558 = (MATCH) time weighted query score is product of
  23.095446 = time-weight
  1.731621 = (MATCH) sum of:
    1.6057005 = (MATCH) product of:
      2.4085507 = (MATCH) sum of:
        2.2247097 = (MATCH) weight(spanNear([JOB_TITLE:java, JOB_TITLE:developer], 0, true) in 20801), product of:
          0.61919963 = queryWeight(spanNear([JOB_TITLE:java, JOB_TITLE:developer], 0, true)), product of:
            12.4461 = idf(JOB_TITLE:  developer=873 java=120)
            0.049750492 = queryNorm
          3.5928795 = (MATCH) fieldWeight(JOB_TITLE:spanNear([java, developer],0, true) in 20801), product of:
            0.57735026 = tf(phraseFreq=0.33333334)
            12.4461 = idf(JOB_TITLE:  developer=873 java=120)
            0.5 = fieldNorm(field=JOB_TITLE, doc=20801)
        0.183841 = (MATCH) weight(spanNear([TEXT:java, TEXT:developer], 0, true) in 20801), product of:
          0.50345445 = queryWeight(spanNear([TEXT:java, TEXT:developer], 0, true)), product of:
            10.119588 = idf(TEXT:  developer=1290 java=838)
            0.049750492 = queryNorm
          0.36515915 = (MATCH) fieldWeight(TEXT:spanNear([java, developer], 0, true) in 20801), product of:
            0.57735026 = tf(phraseFreq=0.33333334)
            10.119588 = idf(TEXT:  developer=1290 java=838)
            0.0625 = fieldNorm(field=TEXT, doc=20801)
      0.6666667 = coord(2/3)
    0.07157092 = (MATCH) weight(LANGUAGE:EN in 20801), product of:
      0.059671503 = queryWeight(LANGUAGE:EN), product of:
        1.1994153 = idf(docFreq=49417, maxDocs=60324)
        0.049750492 = queryNorm
      1.1994153 = (MATCH) fieldWeight(LANGUAGE:EN in 20801), product of:
        1.0 = tf(termFreq(LANGUAGE:EN)=1)
        1.1994153 = idf(docFreq=49417, maxDocs=60324)
        1.0 = fieldNorm(field=LANGUAGE, doc=20801)
    0.05434969 = (MATCH) weight(GATEWAY:work in 20801), product of:
      0.051999267 = queryWeight(GATEWAY:work), product of:
        1.0452011 = idf(docFreq=57657, maxDocs=60324)
        0.049750492 = queryNorm
      1.0452011 = (MATCH) fieldWeight(GATEWAY:work in 20801), product of:
        1.0 = tf(termFreq(GATEWAY:work)=1)
        1.0452011 = idf(docFreq=57657, maxDocs=60324)
        1.0 = fieldNorm(field=GATEWAY, doc=20801)
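
[Editor's note] The figures in the explanation above can be checked by hand: for the JOB_TITLE span clause, the printed queryWeight is reproduced by idf times queryNorm alone, so no factor of 15.0 is visible in it. The following is a small sketch that redoes this arithmetic with the numbers copied from the output; the class and method names are hypothetical, and it verifies only the multiplication, not how Lucene computes the values.

```java
// Recomputes factors of the JOB_TITLE span clause from the explanation above.
// Only the arithmetic is checked; the names here are hypothetical.
final class ExplainArithmeticCheck {

    // queryWeight as printed in the explanation: idf * queryNorm, with no boost line.
    static double queryWeight(double idf, double queryNorm) {
        return idf * queryNorm;
    }

    // fieldWeight as printed in the explanation: tf * idf * fieldNorm.
    static double fieldWeight(double tf, double idf, double fieldNorm) {
        return tf * idf * fieldNorm;
    }

    // Loose comparison, since the explanation prints rounded float values.
    static boolean close(double a, double b) {
        return Math.abs(a - b) <= 1e-5 * Math.max(1.0, Math.abs(b));
    }

    public static void main(String[] args) {
        double idf = 12.4461;           // idf(JOB_TITLE: developer=873 java=120)
        double queryNorm = 0.049750492; // queryNorm
        double tf = 0.57735026;         // tf(phraseFreq=0.33333334)
        double fieldNorm = 0.5;         // fieldNorm(field=JOB_TITLE, doc=20801)

        double qw = queryWeight(idf, queryNorm);
        double fw = fieldWeight(tf, idf, fieldNorm);

        System.out.println("queryWeight matches printed 0.61919963: " + close(qw, 0.61919963));
        System.out.println("fieldWeight matches printed 3.5928795: " + close(fw, 3.5928795));
        System.out.println("clause weight matches printed 2.2247097: " + close(qw * fw, 2.2247097));
        // If a 15.0 boost were an additional factor, the product would not match.
        System.out.println("15.0 * idf * queryNorm matches: " + close(15.0 * qw, 0.61919963));
    }
}
```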

-----Original Message-----
From: Ahmet Arslan [mailto:iorixxx@yahoo.com] 
Sent: Tuesday, February 02, 2010 9:32 AM
To: java-user@lucene.apache.org
Subject: Re: ComplexPhraseQueryParser (Expanded Form and Boosting)

> Second concern: boosting a
> phrase ("java developer"^10.0) doesn't seem to be applied
> when you look at the result explanations when using the
> ComplexPhraseQueryParser - it's respected on single word
> queries and it's respected on phrases using the basic
> QueryParser.

I just tested and was able to see "product of: 10.0 = boost" in the explanations. However, I had added a new term to the query: "java developer"^10.0 java

It seems that, on their own, the queries "java developer"^10.0 and "java developer" are virtually equivalent when there are no other terms.


      



Re: ComplexPhraseQueryParser (Expanded Form and Boosting)

Posted by Ahmet Arslan <io...@yahoo.com>.
> Second concern: boosting a
> phrase ("java developer"^10.0) doesn't seem to be applied
> when you look at the result explanations when using the
> ComplexPhraseQueryParser - it's respected on single word
> queries and it's respected on phrases using the basic
> QueryParser.

I just tested and was able to see "product of: 10.0 = boost" in the explanations. However, I had added a new term to the query: "java developer"^10.0 java

It seems that, on their own, the queries "java developer"^10.0 and "java developer" are virtually equivalent when there are no other terms.
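
[Editor's note] One way to see why a boost on a lone clause has no visible effect: assuming the default scoring, where queryNorm is 1 / sqrt(sum of squared clause weights) and each clause weight is idf times boost, the boost of a single clause cancels against queryNorm. The following is a minimal sketch under that assumption; the class name, method name, and the second clause's idf of 3.0 are made up for illustration.

```java
// Sketch of query normalization, assuming queryNorm = 1 / sqrt(sumOfSquaredWeights)
// with each clause contributing (idf * boost)^2. Names and sample values are hypothetical.
final class QueryNormSketch {

    // Normalized query weight of the clause at `index`: idf * boost * queryNorm.
    static double normalizedWeight(double[] idfs, double[] boosts, int index) {
        double sumOfSquares = 0.0;
        for (int i = 0; i < idfs.length; i++) {
            double w = idfs[i] * boosts[i];
            sumOfSquares += w * w;
        }
        double queryNorm = 1.0 / Math.sqrt(sumOfSquares);
        return idfs[index] * boosts[index] * queryNorm;
    }

    public static void main(String[] args) {
        // Single clause: the boost cancels, so both weights come out the same.
        double[] oneIdf = {12.4461};
        System.out.println("alone, boost 1:  " + normalizedWeight(oneIdf, new double[] {1.0}, 0));
        System.out.println("alone, boost 10: " + normalizedWeight(oneIdf, new double[] {10.0}, 0));

        // Two clauses: the boost changes the first clause's weight relative to the second.
        double[] twoIdfs = {12.4461, 3.0};
        double[] plain = {1.0, 1.0};
        double[] boosted = {10.0, 1.0};
        System.out.println("ratio unboosted: "
                + normalizedWeight(twoIdfs, plain, 0) / normalizedWeight(twoIdfs, plain, 1));
        System.out.println("ratio boosted:   "
                + normalizedWeight(twoIdfs, boosted, 0) / normalizedWeight(twoIdfs, boosted, 1));
    }
}
```

Under this model a boost only matters relative to other clauses, which is consistent with the boost becoming visible once the extra term was added.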


      
