Posted to user@lucenenet.apache.org by Robert Pohl <ro...@gmail.com> on 2009/12/22 12:18:59 UTC

Boost index position?

Hi all,

Is there a way to boost index position?
I want results ranked with the most recently indexed documents on top,
while still taking text relevance into account.
For example, when I search for tiger woods, I don't want articles from
last year in the top positions.

Thanks,
Rob


Suggest Search Terms

Posted by Heath Aldrich <ha...@aes2.com>.
Hello all...

I'm looking for some guidance on how to get suggested search terms working from the Lucene.Net side.
I have seen a few approaches that use Solr, but I'm trying to figure out how to make it happen using Lucene alone.

I would like to be able to suggest the rest of a search term, much as Google does when searching. I can handle the AJAX part of displaying the results with no problem, but I really don't know how to make Lucene provide the suggestions that I should be displaying.

I "think" it is done using n-grams, but that's about as far as I have gotten.

Any guidance is appreciated...

Thanks.
Heath Aldrich



Re: Boost index position?

Posted by Robert Pohl <ro...@gmail.com>.
Hi Eran,

Thank you for your answer!
I solved it with a Sort on the creation date, as you suggested. It is
not an ideal solution, since relevance is ignored, but it works
pretty well :)

Thanks,
Rob



Re: Boost index position?

Posted by Eran Sevi <er...@gmail.com>.
Hi,
The easiest solution is to add a field holding the creation/insertion date.
You can use the DateField static methods to convert the date to a string for
indexing.
You can also use NumericField if you're using version 2.9.1, which
allows fast date range searches.

When running a search, pass a Sort object on that date field.
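
To illustrate the idea, here is a minimal sketch in plain Python (not Lucene.Net code): the date is stored as a fixed-width string so that string order matches chronological order, and the hits are then ordered newest first. The `to_sortable` helper, the timestamp layout and the sample documents are assumptions made for illustration; the string that DateField actually produces may look different.

```python
from datetime import datetime

def to_sortable(dt: datetime) -> str:
    # Fixed-width layout, most significant part first, so comparing the
    # strings gives the same order as comparing the dates themselves.
    return dt.strftime("%Y%m%d%H%M%S")

# Hypothetical documents with a stored creation-date field.
docs = [
    {"title": "Tiger Woods wins", "created": to_sortable(datetime(2008, 6, 16, 9, 0, 0))},
    {"title": "Tiger Woods returns", "created": to_sortable(datetime(2009, 12, 20, 18, 30, 0))},
    {"title": "Tiger Woods interview", "created": to_sortable(datetime(2009, 3, 1, 12, 0, 0))},
]

# Equivalent of sorting the hits on the date field, newest first.
newest_first = sorted(docs, key=lambda d: d["created"], reverse=True)
for d in newest_first:
    print(d["created"], d["title"])
```

Keep in mind that a pure date sort like this ignores the relevance score entirely.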

Try searching the archives (both the .NET and Java ones); this question has
been discussed a few times before.

Eran.


Re: Boost index position?

Posted by Todd Carrico <To...@match.com>.
>>> dateBoost = 1 + (DateBoostCoeff - 1)/(distance.TotalDays + 1)

Another technique is to use the document date's distance from a fixed
reference date. That number grows with time, regardless of when the
document is indexed.
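
A rough sketch of that idea in plain Python, for illustration only. The reference date, the `scale` factor and the linear formula are all assumptions; the post does not say how the distance is turned into a boost.

```python
from datetime import date

# Hypothetical fixed reference date; any date older than all documents works.
REFERENCE_DATE = date(2000, 1, 1)

def static_date_boost(created: date, scale: float = 0.001) -> float:
    # Days between the fixed reference date and the document's own date.
    # Newer documents are further from the reference, so they get more boost.
    days = max((created - REFERENCE_DATE).days, 0)
    return 1.0 + scale * days

last_year = static_date_boost(date(2008, 12, 22))
this_year = static_date_boost(date(2009, 12, 22))
print(last_year, this_year)
```

Because the value depends only on the document's date and not on the day of indexing, the relative order of two documents stays the same no matter when each one was indexed.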





Re: Boost index position?

Posted by Pavlo Zahozhenko <pa...@gmail.com>.
Yes, you can certainly boost by the inverted "distance"; it is the simplest
formula for such a boost. My formula does almost the same thing, only with a
configurable coefficient, and it treats items with a large "distance" almost
the same (so it heavily boosts only items created within the last few
days).

As for incremental indexing, I use it as well, and it is not so easy to deal
with. I use the following method:
During incremental indexing, I take the date of the latest full indexing as a
basis and boost all newly created items so that all of them have a
higher date boost than those already indexed (the greater the "distance"
between the last indexing date and now, the bigger the boost value). So the
date boost value keeps increasing until I run full indexing again,
which normalizes the date boost of all items. This method works only if you
run full indexing fairly frequently, once every few days or at least once a
week. Otherwise, the date boost would quickly become too big and
overshadow other boosts, which might decrease overall search relevance.
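
One possible reading of this method, sketched in plain Python. The post does not give the exact rule, so the linear growth, the `step` parameter and both function names are assumptions made for illustration.

```python
from datetime import date

def full_index_boost(created: date, full_index_day: date, coeff: float) -> float:
    # Boost assigned during a full indexing run; it never exceeds coeff.
    age_days = max((full_index_day - created).days, 0)
    return 1.0 + (coeff - 1.0) / (age_days + 1)

def incremental_boost(indexed_on: date, last_full_index: date,
                      coeff: float, step: float = 0.5) -> float:
    # Assumed rule: start just above the largest full-index boost (coeff)
    # and grow with the number of days since the last full indexing.
    days_since_full = max((indexed_on - last_full_index).days, 0)
    return coeff + step * (days_since_full + 1)

last_full = date(2009, 12, 15)
print(full_index_boost(date(2009, 12, 15), last_full, coeff=5.0))   # newest item at full indexing
print(incremental_boost(date(2009, 12, 16), last_full, coeff=5.0))  # added one day later
print(incremental_boost(date(2009, 12, 22), last_full, coeff=5.0))  # added a week later
```

Under these assumptions the boost for new items keeps climbing until the next full indexing resets everything, which is why long gaps between full runs let the date boost dominate the score.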

Regards,
         Pavlo Zahozhenko


Re: Boost index position?

Posted by Robert Pohl <ro...@gmail.com>.
Thanks Pavlo!

And yes, I use incremental indexing =)
I have a create date, but can I boost by the (inverted) "distance" between
DateTime.Now and CreateDate?

That is, give the biggest boost to the documents whose dates have the lowest distance.

Thanks,
Rob
 



Re: Boost index position?

Posted by Pavlo Zahozhenko <pa...@gmail.com>.
You can look at the creation date of each indexed item and boost the resulting
Lucene documents accordingly. For example, I'm using the following formula:
" dateBoost = 1 + (DateBoostCoeff - 1)/(distance.TotalDays + 1) ", where
distance.TotalDays is the difference in days between DateTime.Today and the
item's creation date. Then simply " doc.SetBoost(dateBoost) ". This formula fits my
needs, but you'll probably have to experiment with different formulas a bit
to find out what fits your model.
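
To get a feel for how the formula behaves, here is a small sketch in plain Python (used only for the arithmetic; it is not Lucene.Net code). The function name, the coefficient value of 5.0 and the clamping of future dates to zero are assumptions added for illustration.

```python
from datetime import date

def date_boost(created: date, today: date, coeff: float) -> float:
    # dateBoost = 1 + (DateBoostCoeff - 1) / (distance.TotalDays + 1)
    # Items dated in the future are treated as created today (an assumption).
    age_days = max((today - created).days, 0)
    return 1.0 + (coeff - 1.0) / (age_days + 1)

today = date(2009, 12, 22)
for created in (date(2009, 12, 22), date(2009, 12, 21),
                date(2009, 12, 15), date(2008, 12, 22)):
    print(created, round(date_boost(created, today, coeff=5.0), 3))
```

With a coefficient of 5.0, an item created today gets a boost of 5.0, one from yesterday gets 3.0, one from a week ago gets 1.5, and one from a year ago is barely above 1.0. The boost falls off quickly, so only very recent items are pushed up noticeably.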

Now all this gets much more complex if you're using incremental indexing,
but there's a solution for it as well.

Regards,
          Pavlo Zahozhenko
