Posted to java-user@lucene.apache.org by liat oren <or...@gmail.com> on 2009/05/24 13:09:54 UTC
BoostingBooleanQuery search time is very long
Hi,
I have an index of 3 million documents.
I perform a regular search, using an analyzer, and get the results within 1-2
minutes.
When I create a boostingBooleanQuery and search the index using a
Similarity whose scorePayload returns the boosting value, the search
takes about 10 minutes.
This is done by parsing a text: each word appears once, but its boosting
value is affected by the frequencies.
Is it because I have to index the documents using a different analyzer?
How can it be done?
Thanks a lot,
Liat
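(Editor's sketch, not part of the original message.) The step described above, where each word is kept once and carries a weight derived from how often it occurs, might look roughly like the following. The class name TermFrequencyBoosts, the method countTerms, and the choice of the raw count as the weight are all assumptions made for illustration; the thread does not say how the boost is actually computed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TermFrequencyBoosts {

    /**
     * Collapses whitespace-separated text into unique terms, each mapped to
     * its raw occurrence count. Using the raw count as the boost is an
     * assumption; the original post does not give the formula.
     */
    public static Map<String, Integer> countTerms(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String term : text.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                counts.merge(term, 1, Integer::sum); // first sight -> 1, then +1
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Numeric tokens, since the code later in the thread parses terms as longs.
        Map<String, Integer> counts = countTerms("12 7 12 40 12 7");
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Each unique term would then become one clause of the query, with its count (or some function of it) as the boost.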
Re: BoostingBooleanQuery search time is very long
Posted by liat oren <or...@gmail.com>.
It is a BooleanQuery that uses boosting.
I created a Similarity class that returns the payload,
and I create the query in the following way:
BooleanQuery bq = new BooleanQuery();
String[] splitWorlds = worlds.split(" ");
for (int i = 0; i < splitWorlds.length; i++)
{
    if (wordsWorldsFreqMap.getMap().get(word).get(Long.parseLong(splitWorlds[i])) != null)
    {
        // gets the boost value from outside source
        double boost = wordsWorldsFreqMap.getMap().get(word).get(Long.parseLong(splitWorlds[i]));
        BoostingTermQuery tq = new BoostingTermQuery(new Term(fieldName, splitWorlds[i]));
        tq.setBoost((float) boost);
        bq.add(tq, BooleanClause.Occur.SHOULD);
    }
}
Similarity:
import org.apache.lucene.search.DefaultSimilarity;

public class WordsSimilarity extends DefaultSimilarity
{
    public WordsSimilarity()
    {
    }

    // Keeps the default tf; the alternative would be: freq > 0 ? 1.0f : 0.0f
    public float tf(float freq)
    {
        return super.tf(freq);
    }

    // Returns the first payload byte as the score (length is not checked).
    public float scorePayload(byte[] payload, int offset, int length)
    {
        return payload[offset];
    }

    // Same as above; fieldName is ignored.
    public float scorePayload(String fieldName, byte[] payload, int offset, int length)
    {
        return payload[offset];
    }
}
I use it since I want to give different weights to different terms.
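(Editor's sketch, not part of the original message.) The Similarity above returns payload[offset] directly, and a Java byte is signed, so any stored value above 127 would come back negative. The standalone snippet below illustrates that with plain Java and no Lucene dependency. The class PayloadByteBoost and its encode/decode helpers are hypothetical: the thread does not show how the payload is written at index time, so a single byte per term is assumed here.

```java
public class PayloadByteBoost {

    /**
     * Packs a small non-negative boost into a one-byte payload.
     * Hypothetical index-time counterpart; not shown in the thread.
     */
    public static byte[] encode(int boost) {
        if (boost < 0 || boost > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("boost must be in 0..127: " + boost);
        }
        return new byte[] { (byte) boost };
    }

    /** Same expression as in WordsSimilarity: the signed byte widens to float. */
    public static float decodeSigned(byte[] payload, int offset) {
        return payload[offset];
    }

    /** Alternative that treats the byte as unsigned, giving 0..255. */
    public static float decodeUnsigned(byte[] payload, int offset) {
        return payload[offset] & 0xFF;
    }

    public static void main(String[] args) {
        byte[] ok = encode(100);
        System.out.println(decodeSigned(ok, 0));      // 100.0

        byte[] raw = { (byte) 200 };                   // written without a range check
        System.out.println(decodeSigned(raw, 0));     // -56.0
        System.out.println(decodeUnsigned(raw, 0));   // 200.0
    }
}
```

If the boosts can exceed 127, either masking with 0xFF or storing the value in more than one byte would avoid negative scores. This concerns scoring correctness only and is not offered as an explanation for the search time.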
2009/5/26 Grant Ingersoll <gs...@apache.org>
> What's a BoostingBooleanQuery?
Re: BoostingBooleanQuery search time is very long
Posted by Grant Ingersoll <gs...@apache.org>.
What's a BoostingBooleanQuery?
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org