Posted to java-user@lucene.apache.org by Matt Savona <ma...@gmail.com> on 2016/03/09 19:23:16 UTC
disable field length normalization on specific fields?
Hi all,
I am trying to understand if the following is possible:
I would like to have several fields in my index which are boosted at index
time. Because they are to be boosted at index time, their field type
requires omitNorms(false).
However, I do not want field length normalization to affect the scoring of
these fields. For example, finding the term 'baseball' (1:5 words) should
score exactly the same as (1:100 words).
There are other fields in my index which are not boosted, so
omitNorms(true) is acceptable on them. However, I do not want to broadly
disable length normalization on every single field (I have at least one
where I require it). Thus, I am not certain a custom Similarity class is
appropriate.
Is it possible to simply disable length normalization on a field-by-field
basis, while still allowing index-time boosting?
Thank you in advance!
- Matt
Re: disable field length normalization on specific fields?
Posted by Chris Hostetter <ho...@fucit.org>.
Yep, just use a customized similarity that doesn't include a length factor
when computing the norm.
If you are currently using TFIDFSimilarity (or one of its subclasses)
then the computeNorm method delegates to a lengthNorm method, and you
can override that to return "1" for fields with a certain name regardless
of the length.
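
A rough sketch of that branch, written as plain Java so it runs on its own
rather than against the Lucene API. It assumes the classic formula of
index-time boost multiplied by 1/sqrt(number of terms), which is my
understanding of how the TF-IDF similarities of this era compute lengthNorm.
The class name, the method signature, and the field names "title" and "tags"
are all hypothetical; in a real Similarity subclass the same check would sit
inside your lengthNorm override. Note that the sketch returns the boost
rather than a literal 1, on the assumption that your version folds the
index-time boost into lengthNorm, so boosting keeps working.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class FieldAwareLengthNorm {
    // Hypothetical names of the fields whose length should not affect scoring.
    private static final Set<String> NO_LENGTH_NORM =
            new HashSet<>(Arrays.asList("title", "tags"));

    /**
     * Stand-alone model of a per-field length norm (not a Lucene method).
     *
     * @param fieldName name of the field being indexed
     * @param boost     index-time boost for the field
     * @param numTerms  number of terms in the field
     */
    public static float lengthNorm(String fieldName, float boost, int numTerms) {
        if (NO_LENGTH_NORM.contains(fieldName)) {
            // Length factor fixed at 1, so only the boost survives.
            return boost;
        }
        // Assumed classic behaviour: boost * 1/sqrt(numTerms).
        return boost * (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // A 5-term and a 100-term "title" get the same norm.
        System.out.println(lengthNorm("title", 2.0f, 5));
        System.out.println(lengthNorm("title", 2.0f, 100));
        // "body" is still penalised for being long.
        System.out.println(lengthNorm("body", 1.0f, 100));
    }
}
```

Keeping the exempt field names in one set means the field where you still
require length normalization is left alone by default.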
If you are currently using something else -- like BM25Similarity perhaps
-- you'll probably have to override the computeNorm method and
write a slightly longer calculation based on whatever logic is in the
computeNorm method you are currently using -- look for usages of
FieldInvertState.getLength() and remove/replace that with a fixed value.
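
To see why a fixed value does the job for BM25, here is a simplified,
stand-alone model of the textbook BM25 term-frequency factor in plain Java.
It is not the Lucene implementation: it skips norm encoding and the index-time
boost entirely, and the class, the helper methods, the field name "title",
and the fixed length of 1 are all hypothetical. k1 = 1.2 and b = 0.75 are the
usual BM25 defaults.

```java
import java.util.Collections;
import java.util.Set;

public class Bm25FixedLength {
    static final float K1 = 1.2f;  // usual BM25 default
    static final float B = 0.75f;  // usual BM25 default

    // Hypothetical: fields whose real length is replaced by a constant.
    private static final Set<String> FIXED_LENGTH_FIELDS = Collections.singleton("title");
    private static final int FIXED_LENGTH = 1;

    /** Length the norm calculation would see for this field. */
    public static int lengthFor(String fieldName, int actualLength) {
        return FIXED_LENGTH_FIELDS.contains(fieldName) ? FIXED_LENGTH : actualLength;
    }

    /** Textbook BM25 term-frequency factor with document-length normalization. */
    public static float tfFactor(float tf, float docLength, float avgLength) {
        float lengthPenalty = 1 - B + B * (docLength / avgLength);
        return tf * (K1 + 1) / (tf + K1 * lengthPenalty);
    }

    public static void main(String[] args) {
        float avg = 50f;
        // Same factor for a 5-term and a 100-term "title".
        System.out.println(tfFactor(1, lengthFor("title", 5), avg));
        System.out.println(tfFactor(1, lengthFor("title", 100), avg));
        // "body" still favours the shorter document.
        System.out.println(tfFactor(1, lengthFor("body", 5), avg));
        System.out.println(tfFactor(1, lengthFor("body", 100), avg));
    }
}
```

Because every document reports the same length for the exempt field, the
length term becomes a constant and no longer separates short documents from
long ones.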
-Hoss
http://www.lucidworks.com/