Posted to solr-user@lucene.apache.org by Tomasz Kępski <to...@kepski.pl> on 2009/11/23 14:01:51 UTC
Boost document base on field length
Hi,
I would like to boost documents with longer descriptions, so that
documents with a zero-length description move down.
I'm wondering whether it is possible to boost a document based on the
field length at search time, or whether the only way is to store the
field length as an int in a separate field while indexing?
Tom
Re: Boost document base on field length
Posted by Lance Norskog <go...@gmail.com>.
With the default Similarity, the Lucene norms, if set, are
1/sqrt(number of terms in the field), multiplied by any index-time boost.
I cannot find a function that makes norms available. Yo gurus- is this
impossible, a bad idea, or just an oversight?
On Tue, Nov 24, 2009 at 6:06 AM, Tomasz Kępski <to...@kepski.pl> wrote:
> Hi,
>
>> I think I'm reading the question differently than Grant -- his suggestion
>> applies when you are searching in the description field, and don't want
>> documents with shorter descriptions to score higher when the same terms
>> match the same number of times (the default behavior of lengthNorm)
>
>> my understanding is that you want documents that don't have a description
>> to score lower than documents that do -- and you might be querying against
>> completely different fields (description might not even be indexed)
>>
>> in that case there is no easy way to achieve this with just the
>> description field ... the easy thing to do is to index a boolean
>> "has_description" field and then incorporate that into your query (or as the
>> input to a function query)
>
> You got my point, Hoss. In my case a long description = good value. And your
> intuition is amazing ;-) I do have a field which is not used in search at
> all (image url), but docs with an image have greater value for me than docs
> without one.
>
> I would then add two fields (a boolean for photo and an int for description
> length), fill them in during indexing, and play with them during the
> search.
>
> Thanks,
> Tom
>
>
--
Lance Norskog
goksron@gmail.com
Re: Boost document base on field length
Posted by Tomasz Kępski <to...@kepski.pl>.
Hi,
> I think I'm reading the question differently than Grant -- his suggestion
> applies when you are searching in the description field, and don't want
> documents with shorter descriptions to score higher when the same terms
> match the same number of times (the default behavior of lengthNorm)
> my understanding is that you want documents that don't have a description
> to score lower than documents that do -- and you might be querying against
> completely different fields (description might not even be indexed)
>
> in that case there is no easy way to achieve this with just the
> description field ... the easy thing to do is to index a boolean
> "has_description" field and then incorporate that into your query (or as
> the input to a function query)
You got my point, Hoss. In my case a long description = good value. And
your intuition is amazing ;-) I do have a field which is not used in
search at all (image url), but docs with an image have greater value for
me than docs without one.
I would then add two fields (a boolean for photo and an int for description
length), fill them in during indexing, and play with them during
the search.
Thanks,
Tom
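A minimal sketch of the index-time step described above, assuming documents are prepared as Python dicts before being sent to Solr. The field names `description`, `image_url`, `has_image`, and `description_length` are hypothetical, and length is counted here in whitespace-separated words (a character count would work just as well).

```python
def add_boost_fields(doc):
    """Return a copy of doc with two derived fields for query-time boosting.

    All field names here are illustrative assumptions, not part of any
    existing schema.
    """
    description = (doc.get("description") or "").strip()
    image_url = (doc.get("image_url") or "").strip()

    enriched = dict(doc)
    # Boolean flag: does the document have a photo at all?
    enriched["has_image"] = bool(image_url)
    # Integer length: number of whitespace-separated words in the description.
    enriched["description_length"] = len(description.split())
    return enriched


doc = add_boost_fields({"id": "1", "description": "red wool coat", "image_url": ""})
print(doc["has_image"], doc["description_length"])
```

Both derived fields would need matching declarations in schema.xml (a boolean and an int type) before they can be used in a query.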
Re: Boost document base on field length
Posted by Chris Hostetter <ho...@fucit.org>.
: > I would like to boost documents with longer descriptions, so that documents with a zero-length description move down.
: > I'm wondering whether it is possible to boost a document based on the field length at search time, or whether the only way is to store the field length as an int in a separate field while indexing?
:
: Override the default Similarity (see the end of the schema.xml file)
: with your own Similarity implementation and then in that class override
: the lengthNorm() method.
I think I'm reading the question differently than Grant -- his suggestion
applies when you are searching in the description field, and don't want
documents with shorter descriptions to score higher when the same terms
match the same number of times (the default behavior of lengthNorm)
my understanding is that you want documents that don't have a description
to score lower than documents that do -- and you might be querying against
completely different fields (description might not even be indexed)
in that case there is no easy way to achieve this with just the
description field ... the easy thing to do is to index a boolean
"has_description" field and then incorporate that into your query (or as
the input to a function query)
-Hoss
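One way the "incorporate that into your query" suggestion might look, sketched as request parameters for the dismax query parser. This assumes a boolean `has_description` field and, for the function-query variant, an int `description_length` field; `title` as the searched field and the boost weight are also assumptions. `bq` adds an optional boost query and `bf` adds a boost function; whether `log` and `sum` are available depends on the Solr version in use.

```python
from urllib.parse import urlencode


def build_boosted_query(user_query, flag_boost=5):
    """Build dismax parameters that favour documents with a description.

    Field names and weights are hypothetical and need tuning per index.
    """
    return {
        "defType": "dismax",
        "q": user_query,
        "qf": "title",  # field actually searched (assumed)
        # Boost query: documents matching the flag get extra score.
        "bq": f"has_description:true^{flag_boost}",
        # Boost function: score grows slowly with description length;
        # the +1 avoids taking the log of zero for empty descriptions.
        "bf": "log(sum(description_length,1))",
    }


params = build_boosted_query("wool coat")
print("/select?" + urlencode(params))
```

In practice one would probably pick either the `bq` flag or the `bf` function rather than both, then adjust the weight until the ordering looks right.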
Re: Boost document base on field length
Posted by Grant Ingersoll <gs...@apache.org>.
On Nov 23, 2009, at 8:01 AM, Tomasz Kępski wrote:
> Hi,
>
> I would like to boost documents with longer descriptions, so that documents with a zero-length description move down.
> I'm wondering whether it is possible to boost a document based on the field length at search time, or whether the only way is to store the field length as an int in a separate field while indexing?
Override the default Similarity (see the end of the schema.xml file) with your own Similarity implementation and then in that class override the lengthNorm() method.
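To illustrate what overriding lengthNorm() changes, here is the arithmetic only, sketched in Python rather than as a real Similarity subclass. The default formula, 1/sqrt(numTerms), is what Lucene's DefaultSimilarity computes; `flat_length_norm` is a hypothetical replacement that ignores field length. Both assume a field with at least one term.

```python
import math


def default_length_norm(num_terms):
    """Lucene's default: shorter fields get a larger norm."""
    return 1.0 / math.sqrt(num_terms)


def flat_length_norm(num_terms):
    """Hypothetical override: field length no longer affects the score."""
    return 1.0


for n in (4, 16, 100):
    print(n, default_length_norm(n), flat_length_norm(n))
```

Norms are computed at index time, so a change like this only takes effect after reindexing, and it only influences queries that actually search the description field.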