
Posted to java-user@lucene.apache.org by liat oren <or...@gmail.com> on 2009/05/24 13:09:54 UTC

BoostingBooleanQuery search time is very long

Hi,
I have an index of 3 million documents.
A regular search, using an analyzer, returns results within 1-2
minutes.
When I create a boostingBooleanQuery and search the index using a
Similarity whose scorePayload returns the boosting value, the search
takes about 10 minutes.
The query is built by parsing a text: each word appears once, but its
boosting value depends on its frequency.

Is it because I have to index the documents using a different analyzer?
How can that be done?

Thanks a lot,
Liat
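
The post above describes a query text in which each word appears once and carries a boost that depends on its frequency. The post does not say how that boost is computed, so the following is only a minimal sketch of one possible reading: the boost is assumed to be the raw occurrence count. The class name TermBoosts and the method fromText are hypothetical and are not part of Lucene.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene or of the original post.
class TermBoosts {
    // Counts how often each whitespace-separated word occurs in the text.
    // Each distinct word appears once in the result, in first-seen order;
    // its count is used as the boost (an assumption made for illustration).
    static Map<String, Float> fromText(String text) {
        Map<String, Float> boosts = new LinkedHashMap<String, Float>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens when the input is blank
            }
            Float current = boosts.get(word);
            boosts.put(word, current == null ? 1.0f : current + 1.0f);
        }
        return boosts;
    }

    public static void main(String[] args) {
        // prints {apple=3.0, pear=2.0, plum=1.0}
        System.out.println(fromText("apple pear apple plum apple pear"));
    }
}
```

Each entry of the resulting map would then correspond to one term clause in the query, with the map value passed as that clause's boost.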

Re: BoostingBooleanQuery search time is very long

Posted by liat oren <or...@gmail.com>.

It is a BooleanQuery that uses boosting.
I created a Similarity class that returns the payload,
and I build the query in the following way:
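  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.BooleanClause;
  import org.apache.lucene.search.BooleanQuery;
  import org.apache.lucene.search.payloads.BoostingTermQuery;

  BooleanQuery bq = new BooleanQuery();
  String[] splitWorlds = worlds.split(" ");
  for (int i = 0; i < splitWorlds.length; i++)
  {
   // Skip tokens that have no boost entry for this word.
   if (wordsWorldsFreqMap.getMap().get(word).get(Long.parseLong(splitWorlds[i])) != null)
   {
    // Gets the boost value from an outside source.
    double boost = wordsWorldsFreqMap.getMap().get(word).get(Long.parseLong(splitWorlds[i]));
    BoostingTermQuery tq = new BoostingTermQuery(new Term(fieldName, splitWorlds[i]));
    tq.setBoost((float) boost);
    // SHOULD: each term is optional and only contributes to the score.
    bq.add(tq, BooleanClause.Occur.SHOULD);
   }
  }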
Similarity:
import org.apache.lucene.search.DefaultSimilarity;

public class WordsSimilarity extends DefaultSimilarity
{
 public WordsSimilarity()
 {
 }
 public float tf(float freq)
 {
  return super.tf(freq); // freq > 0 ? 1.0f : 0.0f;
 }
 public float scorePayload(byte[] payload, int offset, int length)
 {
  // Returns the first payload byte as the score.
  //  if(length == 1)
  //  {
  return payload[offset];
  //  }
 }
 public float scorePayload(String fieldName, byte[] payload, int offset, int length)
 {
  // Ignores fieldName; returns the first payload byte as the score.
  return payload[offset];
 }
}
I use it because I want to give different weights to different terms.
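
A side note on the scorePayload methods above, unrelated to the search time itself: returning payload[offset] widens a single signed byte to a float, so only whole-number boosts from -128 to 127 can be represented, and any stored byte of 0x80 or higher comes back negative. The sketch below shows one way to store a full float in four payload bytes and read it back. It is plain Java under stated assumptions: the class PayloadBoost, its methods, the big-endian layout and the neutral fallback of 1.0f are all hypothetical choices for illustration, not Lucene API.

```java
// Hypothetical helper, not part of Lucene: packs a float boost into a
// 4-byte payload (big-endian, an assumed layout) and reads it back.
class PayloadBoost {
    static byte[] encode(float boost) {
        int bits = Float.floatToIntBits(boost);
        return new byte[] {
            (byte) (bits >>> 24), (byte) (bits >>> 16),
            (byte) (bits >>> 8), (byte) bits
        };
    }

    static float decode(byte[] payload, int offset, int length) {
        if (payload == null || length < 4) {
            return 1.0f; // assumed neutral boost when no full float is stored
        }
        // Mask each byte with 0xFF so that sign extension does not corrupt the bits.
        int bits = ((payload[offset] & 0xFF) << 24)
                 | ((payload[offset + 1] & 0xFF) << 16)
                 | ((payload[offset + 2] & 0xFF) << 8)
                 | (payload[offset + 3] & 0xFF);
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        byte[] single = { (byte) 200 };
        System.out.println((float) single[0]);          // prints -56.0
        System.out.println(decode(encode(2.5f), 0, 4)); // prints 2.5
    }
}
```

A scorePayload override could delegate to a decode method of this kind, provided the payloads were written with the matching encoding at index time.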



2009/5/26 Grant Ingersoll <gs...@apache.org>

> What's a BoostingBooleanQuery?

Re: BoostingBooleanQuery search time is very long

Posted by Grant Ingersoll <gs...@apache.org>.

What's a BoostingBooleanQuery?

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org